Preface



To understand the evolution of things, one must un- 
derstand something about their history as well as the 
environmental forces that had shaping influences upon 
them. Information Foraging Theory evolved through 
a series of fortuitous historical accidents, as well as a 
number of enduring shaping forces. A critical event 
was my move to the Palo Alto Research Center 
(PARC). Soon after I came to PARC at the beginning 
of 1992, I became involved in trying to develop studies
and models around a set of projects that were collec- 
tively called intelligent information access. This in- 
cluded the novel information visualization systems 
investigated in the User Interface Research Area (see, 
e.g., Card et al., 1999) as well as the new techniques
for browsing and searching being created in the
Quantitative Content Area (e.g., Rao et al., 1995). As
part of this effort, a group of us (including Stu Card, 
Dan Russell, Mark Stefik, and John van Gigch from 
California State University — Sacramento) were run- 
ning some quick-and-dirty studies of people such as 
business intelligence analysts and MBA students. Our
studies of people doing information-intensive work
started to give me some sense of the range of phe- 
nomena that we would need to address. Our study 
participants clearly were faced with massive volumes 
of information, often under deadline conditions, and 
making complex search decisions based on assess- 
ments that were enveloped in a great deal of un- 
certainty. 

These information-intensive tasks seemed to be 
different than the human-computer interaction tasks 
that were being addressed by cognitive engineering 
models in the early 1990s, or the science, math, and 
programming tasks addressed by intelligent tutoring 
systems of that same period. Such cognitive models 
addressed tasks that tended to occur in task environ- 
ments that (although large and complex) were well 
defined by a circumscribed domain of possible goals, 
elements of domain knowledge (e.g., about Lisp pro- 
gramming, algebra, word processing), and potential 
actions (e.g., in a formal language, or in a user interface).
In contrast, the behavior of people seeking
information appeared to be largely shaped by the
structure or architecture of the content (the information
environment) and only minimally shaped
by the user's knowledge of the user interface. In addition,
the structure of the information environment was 
fundamentally probabilistic. Consequently, behavior 
was also dominated by choices made in the face of 
uncertainty and the continual evaluation of the ex- 
pected costs and benefits of various actions in the in- 
formation environment, in contrast to the near-certain 
costs and benefits of actions taken in traditional cog- 
nitive modeling domains of the time. 

It was clear that it was going to be a challenge to 
develop theories for information-intensive tasks. Mul- 
ling about this issue, I was drawn to work in two areas in 
which I had done some reading. The first was the work
in the late 1980s of John R. Anderson (e.g., Anderson, 
1990), who was putting forth the argument that to un- 
derstand mechanisms of the mind, one must first try to 
figure out the environmental problems that it solves. 
John developed the method of rational analysis and 
applied this approach to memory, categorization, and 
other areas of cognition with considerable success. I 
wondered if the approach could be applied to the 
analysis of the information environment and how it 
shapes information seeking behavior. 1 The second area
of interest was behavioral ecology (e.g., Smith, 1987), 
which suggested that very diverse strategies adopted by 
people could be systematically predicted from optimi- 
zation analysis that focused first on scrutiny of the en- 
vironment. This particular interest of mine originated 
as an undergraduate at Trent University, where phy- 
siological psychology included coverage of ethology 
(the precursor to behavioral ecology) and anthropology 
included what is known as cultural materialism (the 
precursor to current evolutionary-ecological approaches 
to anthropology). Working through the literature in 
these areas, I was led to optimal foraging theory, and 
particularly to the book by Stephens and Krebs (1986) 
that is the source of the conventional models discussed 
in chapter 2. I quite literally had an "ah-ha" experience
in the middle of a late-night conversation with Jacqui 
LeBlanc in which I laid out the basic analogies between 
information foraging and optimal foraging theory. 

In July 1992, I wrote a working paper titled "Notes
on Adaptive Sense Making in Information Ecolo- 
gies," which discussed the possible application of 
conventional foraging models and the core mathe- 
matics of Stephens and Krebs to idealized informa- 
tion foraging tasks. The working paper got two kinds 



of reactions. The first was one of disbelief in the 
analogy, for a variety of relatively good reasons (e.g., 
humans are not rational, information is not food). 
The second was that the ideas were "audacious" (to
quote Jock Mackinlay). Fortunately, Stu Card (my 
manager and colleague in the User Interface Re- 
search Area) pushed me to pursue this approach, and 
he has been my main sounding board for the devel- 
opment of the theory over the years. By the fall of 
1993, I had enough material to present a seminar at 
the University of California, Berkeley called "Sense
Making in Complex Information Ecologies." 

In the decade that followed, the fruitfulness of In- 
formation Foraging Theory was apparent from the way 
that it could be used to bring messy data into crystal 
clear focus. The first time this happened was in ap- 
plication to the Scatter/Gather study presented in 
chapter 6. Simple analyses of the logs of users inter- 
acting with the system seemed to indicate that users 
were behaving in a nonsystematic way in their allocation
of time or in their choices of interface actions.
The application of optimal foraging models resulted 
in another of those "ah-ha" experiences in which 
suddenly the data plots all fell neatly on lines pre- 
dicted by theory. Like catching a perfect wave in 
surfing, the feeling one gets from that moment when 
one gains power over a small portion of the universe is 
hard to recount without the skill of poetry (which I do 
not have), and it is the reward that keeps you coming 
back.

Information Foraging Theory

Framework and Method 

Knowledge is power. 
— Sir Francis Bacon, 
Meditationes Sacrae.
De Haeresibus (1597)



Modern mankind forages in a world awash in infor- 
mation, of our own creation, that can be transformed 
into knowledge that shapes and powers our engage- 
ment with nature. This information environment has 
coevolved with the epistemic drives and strategies 
that are the essence of our adaptive toolkit. The result
of this coevolution is a staggering volume of content 
that can be transmitted at the speed of light. This 
wealth of information provides resources for adapting 
to the problems posed by our increasingly complex 
world. However, this information environment poses 
its own complex problems that require adaptive 
strategies for information foraging. This book is about 
Information Foraging Theory, which aims to explain 
and predict how people will best shape themselves for 
their information environments and how information 
environments can best be shaped for people. 

Information Foraging Theory is driven by three 
maxims attributable in spirit, if not direct quotation, 
to Allen Newell's (1990) program of Unified Theories
of Cognition: 1 



1. Good science responds to real phenomena or real
problems. Human psychology has evolved as an 
adaptation to the real world. Information forag- 
ing theory is concerned with understanding rep- 
resentative problems posed by the real-world 
information environment and adaptive cogni- 
tive solutions to those problems. 

2. Good science makes a difference. Information 
Foraging Theory is intended to provide the 
basis for application to the design and evalu- 
ation of new technologies for human interac- 
tion with information, such as better ways to 
forage for information on the World Wide 
Web. 

3. Good science is in the details. The aim is to 
produce working formal models for the anal- 
ysis and prediction of observable behavior. 

Like much of Newell's work, the superficial ele- 
gance and simplicity of these maxims unfurls into 
complex sets of entailments. In this book I argue 
that the best approach to studying real information
foraging problems is to adopt methodological adap- 
tationism, which directs our scientific attention to the 
ultimate forces driving adaptation and to the proxi- 
mate psychological mechanisms that are marshaled 
to produce adaptive solutions. Thus, the methodol- 
ogy of Information Foraging Theory is more akin to 
the methodology of biology than that of physics, in 
contrast with the historical bulk of experimental psy- 
chology. To some extent, this choice of methodology 
is a consequence of the success with which Informa- 
tion Foraging Theory has been able to draw upon 
metaphors, models, and techniques from optimal 
foraging theory in biology (Stephens & Krebs, 1986). 
The concern with application (Newell & Card, 1985) 
drives the theory to be relevant to technological de- 
sign and evaluation, which requires that models be 
truly predictive a priori (even if approximately so) 
rather than a "good fit" explanation of the data a pos- 
teriori, as is the case with many current psychological 
models. Being concerned with the details drives the 
theory to marshal a variety of concepts, tools, and 
techniques that allow us to build quantitative, pre- 
dictive models that span many levels of interrelated 
phenomena and interrelated levels of explanation. 
This includes the techniques of task analysis through 
state-space and problem-space representations, ratio- 
nal analysis and optimization analysis of adaptive 
solutions, and production system models of the cog- 
nitive systems that implement those adaptive sol- 
utions. 



Audience 

The intent of this book is to provide a comprehensive 
presentation of Information Foraging Theory, the 
details of empirical investigations of its predictions, 
and applications of the theory to the engineering and 
design of user interfaces. This book aims primarily at 
an interdisciplinary audience with backgrounds and 
interests in the basic and applied science aspects of 
cognitive science, computer science, and the infor- 
mation and library sciences. The theory and method- 
ology have been developed by drawing upon work 
on the rational analysis of cognition, computational 
cognitive modeling, behavioral ecology, and micro- 
economics. The crucible of empirical research that 
has shaped Information Foraging Theory has been 
application problems in human-information inter- 
action, which is emerging as a new branch in the 



field traditionally known as human-computer inter- 
action. Although the emphasis of this book is on the- 
ory and research, the insights and results are intended 
to be relevant to the practitioner interested in a deeper 
understanding of information-seeking behavior and 
guidance on new designs. Chapter 9 is devoted en- 
tirely to practical applications of the theory. 

By its nature, Information Foraging Theory in- 
volves the use of technical material such as mathe- 
matical models and computational models that may 
not be familiar to a broad audience. Generally, the 
technical aspects of the theory and models are pre- 
sented along with succinct discussion of the key 
concepts, insights, and principles that emerge from 
the technical parts, along with illustrative examples, 
metaphors, and graphical methods for understanding 
the key points. The aim of this presentation is to pro- 
vide intuitive understanding along with technical pre- 
cision and insight. 

Frameworks, Theories, and Models 

Like other programs of research in the behavioral and 
cognitive sciences, Information Foraging Theory can 
be discussed in terms of the underlying framework, 
the theory itself, and the models that specify predic- 
tions in specific situations. Frameworks are the gen- 
eral pools of concepts, assumptions, claims, heuris- 
tics, and so forth, that are drawn from to develop 
theories, as well as the methods for using them to understand
and predict the world. Often, frameworks
will overlap. For instance, information processing 
psychology is a broad framework that assumes that 
theories about human behavior can be constructed 
out of information processing concepts, such as pro- 
cesses that transduce physical sensations into sensory 
information, elements storing various kinds of infor- 
mation, and computational processes operating over 
those elements. A related framework, connectionism, 
shares these assumptions but makes additional ones 
about the nature of information processing being 
neuronlike. Although bold claims may be made by 
frameworks, these are typically not testable in and of 
themselves. For instance, whether the mind is mostly 
a general purpose learning machine or mostly a col- 
lection of exquisitely evolved computational modules 
are not testable claims in and of themselves. 

Theories can be constructed within frameworks 
by providing additional assumptions that allow one to
make predictions that can be falsified. Typically, this is
achieved by specifying a model for a specific situation 
or class of situations that makes precise predictions 
that can be fit to observation and measurement. For 
instance, a model of information seeking on the Web 
(SNIF-ACT) is presented in chapter 5 that predicts 
the observed choice of Web links in given tasks. It 
includes theoretical specifications of the information 
processing model of the user, as well as assumptions 
about the conditions under which it applies (e.g., 
English-speaking adults seeking information about un- 
familiar topics). The bulk of this book is about Infor- 
mation Foraging Theory and specific models. The 
aim of this introductory chapter is to provide an out- 
line of the underlying framework and methodology 
in which Information Foraging Theory is embedded. 
However, before presenting such abstractions, a simple 
example is offered in order to illustrate the basic ele- 
ments and approach of Information Foraging Theory. 

Illustration 

The basic approach of Information Foraging Theory 
can be illustrated with a simple example that I hope 
is familiar to many, involving the task of finding a 
good, reasonably priced hotel using the World Wide 
Web (Pemberton, 2003). A typical hotel Web site 
(see figure 1.1) will allow a user to search for avail- 
able hotels in some specified location (e.g., "Paris") 
and then allows the user to sort the results by the 
hotel star rating (an indicator of quality) or by price 
(but not both). The user must then click-select each 
result to read it, because often the price, location, and 
features summaries are inaccurate. Lamenting the 
often poor quality of such hotel Web sites, Pem- 
berton (2003) suggested that improved "usability is 
about optimizing the time you take to achieve your 
purpose, how well you achieve it, and the satisfaction
in doing it. ... How fast can you find the perfect
hotel?" This notion of usability is at the core of
Information Foraging Theory.

For illustration, consider the somewhat simplified 
and idealized task of finding a low-priced, two-star 
hotel in Paris. 2 This example shows (in much sim- 
plified form) the key steps to developing a model of 
information foraging: (a) a rational analysis of the task 
and information environment that draws on optimal 
foraging theory from biology and (b) a production 
system model of the cognitive structure of the task.




Rational Analysis of the Task 
and Information Environment 

Figure 1.2 presents an analysis of the results of a search for
two-star Paris hotels that I conducted on a popular 
hotel Web site. The Paris hotel descriptions and 
prices were returned as a vertical list presented over 
several Web pages. I sorted the list by star rating and 
went to the page that began to list two-star hotels. In 
figure 1.2, the x-axis indicates the order of two-star 
hotel listings in the search result list when sorted 
by star rating, beginning at the first two-star hotel 
through the last two-star hotel, and the y-axis indi- 
cates price. Prices fluctuate as one proceeds down the 
list of Paris hotels. As noted above, this particular 
hotel Web site, like many others, does not allow the 
user to sort by both quality (star rating) and price — 
one must choose one or the other sorting. Assume a 
rational (and perhaps somewhat boring) hotel shop- 
per who was concerned only with being frugal and 
sleeping in a two-star hotel. If that shopper method- 
ically scanned the two-star hotel listings, keeping 
track of only the lowest priced hotel found so far, the 
lowest price encountered would decrease as plotted 
in figure 1.3. That is, the shopper would at first find a 
relatively rapid decrease in lowest price, followed by 
fewer improvements as the scan progressed. Figure 
1.4 shows the savings attained (compared with the 
very first hotel price found on the list) by continuing 
to scan down the list. Figure 1.4 is a typical dimin- 
ishing returns curve in which additional benefits 
(returns) diminish as one invests more resources (in 
this case, scan time). 
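
The scan described above can be sketched in a few lines of Python. This is only an illustration: the price list is hypothetical (it is not the data plotted in figure 1.2; only the first price of $110 is taken from the caption of figure 1.4), and the function names `running_minimum` and `savings` are introduced here for convenience.

```python
from itertools import accumulate


def running_minimum(prices):
    """Lowest price seen so far after scanning each listing, in list order."""
    return list(accumulate(prices, min))


def savings(prices):
    """Savings relative to the first price encountered, after each listing."""
    first = prices[0]
    return [first - best for best in running_minimum(prices)]


# Hypothetical two-star hotel prices (assumed for illustration only).
prices = [110, 95, 120, 88, 101, 88, 79, 130, 92, 85, 79, 97]

best_so_far = running_minimum(prices)  # the kind of curve shown in figure 1.3
saved = savings(prices)                # the kind of curve shown in figure 1.4
```

With these made-up prices, the savings rise quickly over the first few listings and then stay flat for the rest of the list, which is the diminishing returns pattern the text describes.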

A diminishing returns curve such as figure 1.4 
implies that the expected value of continuing to scan 
diminishes with each additional listing scanned. If 
the list of search results were very long— as is often 
the case with the results produced by Web search 
engines — there is usually a point at which the infor- 
mation forager faces the decision of whether it is 
worth the effort of continuing to search for a better 
result than anything encountered so far. In the par- 
ticular example plotted in figure 1.2, there were no 
additional savings for the last 18 items scanned. 
Figure 1.3 includes a plot of the expected minimum 
price encountered as a function of scanning a search 
result list, and figure 1.4 includes a plot of the expected
savings as a function of scanning. These expectations
were computed assuming that observed
hotel prices in figure 1.2 come from a standard
distribution of commodity prices (see the appendix
for details). Assuming that our hypothetical rational
hotel shopper valued time (time is money), the question
would be whether the savings expected to be gained
by additional scanning of hotel results was worth the
time expected to be expended.

Figure 1.1. A typical Web page from a hotel search site.
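
The trade-off between expected savings and time can be sketched numerically. The appendix's price distribution is not reproduced in this section, so the sketch below simply assumes a lognormal distribution with a median near $100; that choice, the dollar cost of scanning one listing, and the function names are all assumptions made for illustration, not the model used for figures 1.3 and 1.4.

```python
import random


def expected_minimum(n_listings, draw_price, n_trials=2000, seed=0):
    """Monte Carlo estimate of the expected lowest price after 1..n listings."""
    rng = random.Random(seed)
    totals = [0.0] * n_listings
    for _ in range(n_trials):
        best = float("inf")
        for i in range(n_listings):
            best = min(best, draw_price(rng))
            totals[i] += best
    return [total / n_trials for total in totals]


def listings_worth_scanning(expected_min, cost_per_listing):
    """Number of listings scanned before the expected extra saving from one
    more listing drops below the cost (in dollars) of scanning it."""
    n = 1  # the shopper has to look at at least one listing
    while (n < len(expected_min)
           and expected_min[n - 1] - expected_min[n] >= cost_per_listing):
        n += 1
    return n


def draw_price(rng):
    """Assumed price model: lognormal with a median of roughly $100."""
    return rng.lognormvariate(4.6, 0.25)


curve = expected_minimum(50, draw_price)
stop_at = listings_worth_scanning(curve, cost_per_listing=0.50)
```

A shopper who values time more highly (a larger `cost_per_listing`) stops earlier on the same curve, which is the sense in which the decision depends on both the information environment and the forager's costs.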

In contrast to this simple illustration, typical in- 
formation problems solved on the Web are more 



complicated (Morrison, Pirolli, & Card, 2001), and 
the assessments of the utility of encountered items in 
information foraging depend on more subtle cues than 
just prices. However, the basic problem of judging 
whether continued foraging will be useful or a waste 
of valuable time is surely familiar to Web users. It 
turns out that this problem is very similar to one class 
of problems dealt with in optimal foraging theory.

Figure 1.2. Prices of two-star Paris hotels in the
order encountered in the results of a search of a hotel
Web site.

Figure 1.3. The minimum two-star Paris hotel price
as a function of order of encounter (list order). The observed
prices are the same as those in figure 1.2. The observed
minimum is the least expensive hotel price found so
far in a process that proceeds through the prices in the
order listed. The expected minimum is a prediction
based on the assumption that prices are being sequentially
and randomly sampled from a fixed distribution
of prices (see the appendix for details).

Figure 1.4. Diminishing returns of savings as a function
of list order. The observed savings is the difference
between the observed minimum price found so
far and the first price encountered ($110), presented
in figure 1.3. The expected savings is the difference
between the expected minimum price and first price
encountered.



An Optimal Foraging Analogy 

Many animals forage in patchy environments, with 
food arranged into clumps. For instance, a bird that 
feeds on berries in bushes will spend part of its time 
searching for the next bush and part of its time berry 
picking after having found a bush. Often, as an ani- 
mal forages in a patch, it becomes harder to find food 
items. In other words, foraging within a food patch 
often exhibits a diminishing returns curve similar to 
the one in figure 1.5. Such diminishing returns may 
occur, for instance, because prey actively avoid the 
forager as they become aware of the threat of preda- 
tion. Diminishing returns may also occur because the 
forager has a strategy of picking off the more highly 
profitable items first (e.g., bigger berries for the hy- 
pothetical bird) from a patch with finite resources. 
Like the hypothetical Web shopper discussed above, 
the problem for a food forager facing diminishing 
returns in a patch is whether to continue investing 
efforts in getting more out of the patch, or to go look 
for another patch. 

Figure 1.5 is a graphical version of a simple conventional
patch model (Stephens & Krebs, 1986) based
on Charnov's Marginal Value Theorem (Charnov,
1976). The model depicted in figure 1.5 assumes that
an animal foraging for food encounters only one kind
of food patch at random that is never reencountered.
When searching for the next food patch, it takes
an average of $t_B$ amount of time to find the next
patch (between-patch time). Once a patch is encountered,
foraging within the patch returns some
amount of energy (e.g., as measured by calories) that
increases as a function, $g$, of the time, $t_W$, spent
foraging within the patch. Figure 1.5 shows a diminishing
returns function, $g$, for within-patch foraging. The
problem for the forager is how much time, $t_W$, to
spend within each patch before leaving to find the
next patch.

Figure 1.5. Charnov's Marginal Value Theorem
states that the rate-maximizing time to spend in a
patch, $t^*$, occurs when the slope of the within-patch
gain function $g$ is equal to the average rate of gain,
which is the slope of the tangent line $R^*$. (Vertical
axis: gain, in energy.)

The conventional patch model assumes that the
animal forager optimizes the overall rate of gain, $R$,
that characterizes the amount of energy gained per
unit time of foraging:

$$R = \frac{g(t_W)}{t_B + t_W}, \tag{1.1}$$

or the amount of energy (calories) gained from an
average patch divided by the time spent traveling
from one patch to the next ($t_B$) plus the time spent
foraging within a patch ($t_W$). The optimal amount of
time, $t^*$, to spend in a patch is the one that yields the
maximum rate of gain, $R^*$,

$$R^* = \frac{g(t^*)}{t_B + t^*}. \tag{1.2}$$

Charnov's Marginal Value Theorem (Charnov,
1976) is a mathematical solution to this problem of
determining $t^*$. It basically says that a forager should
leave a patch when the rate of gain within the patch
[as measured by the slope of $g(t_W)$ or, more specifically,
the derivative $g'(t_W)$] drops below the rate of
gain that could be achieved by traveling to, and foraging
in, a new patch. That is, the optimal forager
obeys the rule,

if $g'(t_W) > R^*$, then continue foraging in the
patch; otherwise,

when $g'(t_W) < R^*$, then start looking for a new
patch.

Charnov's Marginal Value Theorem can be illustrated
graphically in figure 1.5 for this simple problem
(one kind of patch, randomly distributed in the
world). First, note that the gain function $g$ begins to
climb only after $t_B$, which captures the fact that it
takes $t_B$ time to go from the last patch to a new patch.
If we draw a line beginning at the origin to any point
on the gain function, $g$, then the slope of that line
will be the overall rate of gain $R$, as specified in
equation 1.1. Figure 1.5 shows such a line drawn
from the origin to a point just tangent to the function
$g$. The slope of this line is the optimal rate of gain $R^*$
as computed in equation 1.2. This can be verified
graphically by imagining other lines drawn from the
origin to points on the function $g$. None of those lines
will have a steeper slope than the line plotted in
figure 1.5. The point at which the line is tangent to $g$
will be the point at which the rate of gain, $g'(t_W)$,
within the patch is equal to $R^*$. This point also determines
$t^*$, the optimum time to spend within the
average patch.
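
A small numerical sketch may make the theorem concrete. It assumes a particular diminishing returns gain function, g(t_W) = a(1 - exp(-k t_W)), with made-up parameter values; neither the functional form nor the numbers come from the text, and the function names are introduced only for this illustration. The sketch finds the within-patch time at which the slope of g equals the overall rate of gain from equation 1.1.

```python
import math


def gain(t_w, a=100.0, k=0.5):
    """Assumed gain function g(t_W) = a * (1 - exp(-k * t_W))."""
    return a * (1.0 - math.exp(-k * t_w))


def gain_slope(t_w, a=100.0, k=0.5):
    """Derivative g'(t_W) of the assumed gain function."""
    return a * k * math.exp(-k * t_w)


def rate_of_gain(t_w, t_b):
    """Overall rate of gain R from equation 1.1."""
    return gain(t_w) / (t_b + t_w)


def optimal_patch_time(t_b, lo=1e-6, hi=100.0, tol=1e-9):
    """Bisect on g'(t_W) - R(t_W); for a concave g this difference changes
    sign exactly once, at the rate-maximizing time t*."""
    def excess(t_w):
        return gain_slope(t_w) - rate_of_gain(t_w, t_b)

    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid   # still gaining faster than average: stay longer
        else:
            hi = mid   # marginal gain has dropped below average: leave earlier
    return (lo + hi) / 2.0


t_star_near = optimal_patch_time(t_b=2.0)  # patches close together
t_star_far = optimal_patch_time(t_b=4.0)   # patches farther apart
```

Under these assumptions the optimal stay is longer when the between-patch time is longer, which is the qualitative prediction usually drawn from the tangent construction in figure 1.5.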



Production System Models 

The rational analyses in Information Foraging The- 
ory, which often draw from optimal foraging theory, 
are used to inform the development of production 
system models. These rational analyses make mini- 
mal assumptions about the capabilities of foragers. 
Herbert Simon (1955) argued that organisms are not 
optimal, rational agents having perfect information 
and unlimited computational resources. Rather, or- 
ganisms exhibit bounded rationality. That is, agents 
are rational and adaptive, within the constraints of 
the environment and the psychological machinery
available to them biologically. Production system
models provide a way of specifying the mechanistic 
structures and processes that implement bounded 
rationality. On the one hand, production systems have 
been used in psychology as a particular kind of com- 
puter simulation formalism for specifying the infor- 
mation processing that theorists believe people are 
performing. On the other hand, production systems 
have evolved into something more than just a class of 
computer simulation languages: They have become 
theories about the basic information processing ar- 
chitecture of cognition that is implemented in human 
brains (Anderson, 1983; Anderson & Lebiere, 1998; 
Newell, 1990). 

In general, as used in psychology, 3 production 
systems are composed of a set of production rules that 
specify the dynamics of information processing per- 
formed by cognition (how we think). Production rules 
operate over memories (or databases) that contain sym- 
bolic structures that represent aspects of the external 
environment and internal thought (what we think
about). The system operates in a cyclical fashion in 
which production rules are selected based on the 
contents of the data memories and then executed. 
The execution of a production rule typically results 
in some change to the memories. 
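
The cycle just described can be sketched as a short loop. This is a generic illustration rather than ACT-R: the rule format (a dictionary with `name`, `condition`, and `action`), the function name `run_production_system`, and the toy counting rule are all assumptions, and selection among matching rules is reduced to taking the first match.

```python
def run_production_system(rules, memory, max_cycles=100):
    """Repeatedly select a rule whose condition matches memory and execute it."""
    trace = []
    for _ in range(max_cycles):
        # Match: find every rule whose condition holds for the current memory.
        matching = [rule for rule in rules if rule["condition"](memory)]
        if not matching:
            break  # no rule applies, so the system halts
        # Select: simplified here to "first matching rule wins".
        chosen = matching[0]
        # Execute: the action changes the contents of memory.
        chosen["action"](memory)
        trace.append(chosen["name"])
    return trace


# A toy rule set: keep incrementing a counter until it reaches a target.
rules = [
    {
        "name": "count-up",
        "condition": lambda m: m["count"] < m["target"],
        "action": lambda m: m.update(count=m["count"] + 1),
    },
]
memory = {"count": 0, "target": 3}
trace = run_production_system(rules, memory)
```

Each pass through the loop corresponds to one cycle: rules are matched against memory, one is selected, and its execution changes memory so that a different set of rules may match on the next cycle.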

The production system models presented in this 
book are extensions of ACT theory (Anderson et al., 
2004; Anderson & Lebiere, 1998). ACT (Adaptive 
Control of Thought) theory assumes that there are 
two kinds of knowledge, declarative and procedural 
(Ryle, 1949). Declarative knowledge is the kind of 
knowledge that a person can attend to, reflect upon, 
and usually articulate in some way (e.g., by declaring 
it verbally or by gesture). Declarative knowledge in- 
cludes the kinds of factual knowledge that users can 
verbalize, such as "The 'open' item on the 'file' menu
will open a file." Procedural knowledge is the know- 
how we display in our behavior, without conscious 
awareness. For instance, knowledge of how to ride a 
bike and knowledge of how to point a mouse to a 
menu item are examples of procedural knowl- 
edge. Procedural knowledge specifies how declarative 
knowledge is transformed into active behavior. 

ACT-R (the most recent of the ACT theories) has 
a memory for each kind of knowledge (i.e., a de- 
clarative memory and a procedural memory) plus a 
special goal memory. At any point in time, there may 
be a number of goals in goal memory, but the system 
behavior is focused to achieve just one goal at a time.
Complex arrangements of goals and subgoals (e.g.,
for developing and executing plans to find and use 
information) can be implemented by manipulating 
goals in goal memory. 

Production rules (or productions) are used to 
represent procedural knowledge in ACT-R. That is, 
they specify how to apply cognitive skill (know-how) 
and how to retrieve and use declarative knowledge. 
Table 1.1 presents an example of a production sys- 
tem for the task of finding a low-cost hotel using a 
Web site. The example in table 1.1 is not intended 
to be a psychologically plausible model, but rather it 
illustrates key aspects of production system mod- 
els and how they are used in this book. The pro- 
ductions in table 1.1 are English glosses of produc- 
tions written in ACT-R 5.0, which is discussed in 
greater detail below. 4 Each production rule is of the 
form 

IF (condition), THEN (actions). 

The condition of a rule specifies a pattern. When 
the contents of declarative working memory match the 
pattern, the rule may be selected for application. The 
actions of the rule specify additions and deletions of 
content in declarative working memory, as well as 
motor commands. These actions are executed if the 
rule is selected to apply. In ACT-R, each production 
rule has conditions that specify which goal informa- 
tion must be matched and which declarative memory 
must be retrieved. Each production rule has actions 
that specify behavioral actions and possibly the set- 
ting of subgoals. Typically, ACT-R goal memory is 
operated on as what is known in computer science as 
a push-down stack: a kind of memory in which the 
last item stored will be the first item retrieved. Hence, 
storing a new goal is referred to as "pushing a goal on 
the stack," and retrieval is referred to as "popping a 
goal from the stack." 
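
The last-in, first-out behavior of goal memory can be shown with an ordinary Python list. This is a bare sketch of the stack discipline only; the goal descriptions are placeholder strings, not ACT-R structures.

```python
goal_stack = []                                      # goal memory as a push-down stack

goal_stack.append("find a hotel price")              # push the initial goal
goal_stack.append("find the minimum of two prices")  # push a subgoal

focus = goal_stack[-1]       # behavior is focused on the most recently pushed goal
finished = goal_stack.pop()  # pop the subgoal once it has been achieved
resumed = goal_stack[-1]     # the earlier goal becomes the focus again
```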

The production rules in table 1.1 assume that 
declarative memory contains knowledge encoded 
from the external world about the location and con- 
tent of links on a Web page. The productions also 
assume that an initial goal is set to find a hotel price, 
and the productions accomplish the task by "scan- 
ning" through the links keeping track of the lowest 
price found so far. This involves setting a subgoal to 
judge the minimum of the current best price and the 
price just attended when each link is scanned. Table 
1.2 presents a trace of the productions in table 1.1
operating to scan the list of hotel prices depicted in
figure 1.1 and graphed in figure 1.2.

Table 1.1. A production system for the task of finding a low hotel price.

P1: Start
IF the goal is to find a hotel
   & there is a page of Web results
   & no link location has been processed
THEN modify the goal to specify that the first location is to be processed

P2: First-link
IF the goal is to find a hotel
   & a link location is specified
   & no best price has been noted yet
   & the link at the location indicates a price
   & the link is followed by a link at a new location
THEN note that the best price is the price from the link at that location
   & modify the goal to specify the new location of the next link

P3: Next-link
IF the goal is to find a hotel
   & a link location is specified
   & there is a current best price
   & the link at the location indicates a new price
   & the link is followed by a link at a new location
THEN create a subgoal to find the minimum of the current price and the new price
   & push the subgoal on the goal stack
   & modify the current goal to specify the new location of the next link
   & note the resulting new minimum price as the best price

P4: Minimum-price-stays-the-same
IF the goal is to find the minimum of the current price and the new price
   & there is a current best price
   & there is a new price
   & the current best price is less than or equal to the new price
THEN note that the current best price is the minimum
   & pop the subgoal

P5: New-minimum-price
IF the goal is to find the minimum of the current price and the new price
   & there is a current best price
   & there is a new price
   & the current best price is greater than the new price
THEN note that the new price is the minimum
   & pop the subgoal

P6: Go-do-something-else (Done)
IF the goal is to find a hotel
   & there is a current best price
THEN stop

Production "P1: Start" in table 1.1 applies at cycle 0
in table 1.2 when the goal is to find a hotel price. Pro- 
duction "P2: First-link" applies at cycle 1 to scan the 
first link location and set the initial minimum hotel 
price. Then, production "P3: Next-link" applies repeatedly
to scan subsequent links (cycles 2-50). For
each link scanned, P3 sets a subgoal (by creating a new
goal and making it the focus in goal memory) to
compare the currently scanned price to the current 
minimum price. This subgoal evokes either production 
"P4: Minimum-price-stays-the-same" or "P5: New- 
minimum-price." When either P4 or P5 applies, it pops 
the subgoal to determine the minimum, and control 
passes back to the top-level goal of finding a hotel price. 
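The control flow just described can be sketched as a short Python loop. This is only an illustration under simplifying assumptions: the function name `scan_links` is hypothetical, the prices are the link prices listed in the trace of table 1.2, and the stopping decision is not modeled, so the loop scans every price it is given.

```python
from typing import List, Tuple

# Link prices at locations 1-26, as listed in the trace of table 1.2.
PRICES = [110, 86, 76, 80, 86, 76, 96, 110, 86, 96, 110, 86, 86,
          76, 90, 76, 130, 86, 98, 86, 120, 80, 80, 100, 86, 66]


def scan_links(prices: List[int]) -> Tuple[int, List[str]]:
    """Scan prices in order, returning the best price and one entry per cycle."""
    goal_stack = ["find-hotel"]
    trace = ["start"]                       # P1: begin at the first location
    best = prices[0]
    trace.append("first-link")              # P2: note the initial best price
    for new_price in prices[1:]:
        trace.append("next-link")           # P3: attend the next link
        goal_stack.append("find-minimum")   # push the comparison subgoal
        if best <= new_price:
            trace.append("minimum-price-stays-same")  # P4
        else:
            best = new_price
            trace.append("new-minimum-price")         # P5
        goal_stack.pop()                    # control returns to the top-level goal
    trace.append("done")                    # P6
    return best, trace


best_price, trace = scan_links(PRICES)
```

Each list index in `trace` plays the role of a cycle number, so the comparison result for the link at location k (for k of 2 or more) appears one cycle after the corresponding `next-link` entry.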

TABLE 1.2 Trace of the production system specified in table 1.1.

Cycle 0: Start
Cycle 1: first-link   Location: 1   Link-Price: 110   Current-Best: 110
Cycle 2: next-link   Location: 2   Link-Price: 86   Current-Best: 110
Cycle 3: new-minimum-price
Cycle 4: next-link   Location: 3   Link-Price: 76   Current-Best: 86
Cycle 5: new-minimum-price
Cycle 6: next-link   Location: 4   Link-Price: 80   Current-Best: 76
Cycle 7: minimum-price-stays-same
Cycle 8: next-link   Location: 5   Link-Price: 86   Current-Best: 76
Cycle 9: minimum-price-stays-same
Cycle 10: next-link   Location: 6   Link-Price: 76   Current-Best: 76
Cycle 11: minimum-price-stays-same
Cycle 12: next-link   Location: 7   Link-Price: 96   Current-Best: 76
Cycle 13: minimum-price-stays-same
Cycle 14: next-link   Location: 8   Link-Price: 110   Current-Best: 76
Cycle 15: minimum-price-stays-same
Cycle 16: next-link   Location: 9   Link-Price: 86   Current-Best: 76
Cycle 17: minimum-price-stays-same
Cycle 18: next-link   Location: 10   Link-Price: 96   Current-Best: 76
Cycle 19: minimum-price-stays-same
Cycle 20: next-link   Location: 11   Link-Price: 110   Current-Best: 76
Cycle 21: minimum-price-stays-same
Cycle 22: next-link   Location: 12   Link-Price: 86   Current-Best: 76
Cycle 23: minimum-price-stays-same
Cycle 24: next-link   Location: 13   Link-Price: 86   Current-Best: 76
Cycle 25: minimum-price-stays-same
Cycle 26: next-link   Location: 14   Link-Price: 76   Current-Best: 76
Cycle 27: minimum-price-stays-same
Cycle 28: next-link   Location: 15   Link-Price: 90   Current-Best: 76
Cycle 29: minimum-price-stays-same
Cycle 30: next-link   Location: 16   Link-Price: 76   Current-Best: 76
Cycle 31: minimum-price-stays-same
Cycle 32: next-link   Location: 17   Link-Price: 130   Current-Best: 76
Cycle 33: minimum-price-stays-same
Cycle 34: next-link   Location: 18   Link-Price: 86   Current-Best: 76
Cycle 35: minimum-price-stays-same
Cycle 36: next-link   Location: 19   Link-Price: 98   Current-Best: 76
Cycle 37: minimum-price-stays-same
Cycle 38: next-link   Location: 20   Link-Price: 86   Current-Best: 76
Cycle 39: minimum-price-stays-same
Cycle 40: next-link   Location: 21   Link-Price: 120   Current-Best: 76
Cycle 41: minimum-price-stays-same
Cycle 42: next-link   Location: 22   Link-Price: 80   Current-Best: 76
Cycle 43: minimum-price-stays-same
Cycle 44: next-link   Location: 23   Link-Price: 80   Current-Best: 76
Cycle 45: minimum-price-stays-same
Cycle 46: next-link   Location: 24   Link-Price: 100   Current-Best: 76
Cycle 47: minimum-price-stays-same
Cycle 48: next-link   Location: 25   Link-Price: 86   Current-Best: 76
Cycle 49: minimum-price-stays-same
Cycle 50: next-link   Location: 26   Link-Price: 66   Current-Best: 76
Cycle 51: new-minimum-price
Cycle 52: DONE!!! Best price is: 66
Total Time: 782.30005 sec

Note in table 1.2 that the trace ends at cycle 52
with the execution of production "P6: Done" after
scanning the link at location 26 in the list of results.
The list actually contains 44 links in the result list 
(figure 1.2). The production system stops at link lo- 
cation 26 because of the way it implements elements 
of the rational analysis described above. Productions
"P3: Next-link" and "P6: Done" match very similar
patterns in declarative memory. In fact, on every 
cycle that P3 or P6 fires in the trace, the other pro- 
duction also matches. In production system termi- 
nology, P3 and P6 form a conflict set when on a 
particular cycle they both match the current pattern 
in the goal stack and declarative memory. In such 
cases, the utility of each production in the conflict set 
is evaluated and used to perform conflict resolution to 
determine which production to execute. 

Production "P6: Done" is associated with a utility 
that corresponds to R discussed above: the overall rate 
of gain. I simply assumed that this corresponds to 
how the production system values its time. For the 
trace in table 1.2, I assumed that the production 
system valued its time at R = $10/hour. 

Production "P3: Next-link" is associated with a
utility that corresponds to g'(t) discussed above: the
rate of savings that would be achieved by looking at
the next link, that is, (expected savings from scanning
the next link) / (time to scan the link, in hours). The
appendix discusses how expected savings is computed assuming
the distribution of hotel prices evident in figure 1.2. 
From self-observation, I noted that it took 30 sec (30/ 
3600 hour) to scan a link on the Web site depicted in 
figure 1.1. The competition between productions P3 
and P6 implements the key idea of Charnov's Marginal
Value Theorem: As long as the rate of savings
expected for production "P3: Next-link" is greater 
than the overall rate of gain, R, associated with "P6: 
Done," then the system continues to scan links; 
otherwise, it quits. 
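The competition between P3 and P6 can be sketched as a comparison of two utilities. In the sketch below, R = $10/hour and the 30-second scan time come from the text; everything else is an assumption made for illustration. In particular, `expected_savings` uses a simple average over a hypothetical sample of prices, which is not the computation given in the appendix, and the function names are invented here.

```python
from typing import Sequence

R = 10.0                 # dollars per hour: value of time, from the text
SCAN_HOURS = 30 / 3600   # 30 seconds to scan one link, from the text

# Expected savings per link at which the two utilities are equal:
# R * SCAN_HOURS, roughly 8.3 cents with the numbers above.
BREAK_EVEN_SAVINGS = R * SCAN_HOURS


def expected_savings(best: float, price_sample: Sequence[float]) -> float:
    """ASSUMPTION: average improvement over `best` if the next price is drawn
    uniformly from `price_sample` (a stand-in for the appendix's method)."""
    return sum(max(best - p, 0.0) for p in price_sample) / len(price_sample)


def resolve_conflict(best: float, price_sample: Sequence[float]) -> str:
    """Pick the production with the higher utility, as in conflict resolution."""
    utility_next_link = expected_savings(best, price_sample) / SCAN_HOURS
    utility_done = R
    return "P3: Next-link" if utility_next_link > utility_done else "P6: Done"


sample = [60.0, 80.0, 100.0]                # hypothetical price distribution
early = resolve_conflict(110.0, sample)     # large expected savings: keep scanning
late = resolve_conflict(60.0, sample)       # no expected savings: stop
```

Because the outcome depends entirely on the assumed price distribution, this sketch should not be expected to reproduce the stopping point in table 1.2.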

Summary 

I have presented this simple concrete example to 
sketch out the overall framework and approach of 
Information Foraging Theory before beginning more 
abstract discussion of framework and method. At this 
preliminary stage, it was necessary to gloss over unre- 
alistic assumptions about Web use and the technical 
details of the analysis and model. However, it is im- 
portant to point out two realistic aspects of the 
example. First, as will become clear in chapter 3, the 
Web does have a patchy structure (e.g., Web sites and
search results), and diminishing returns within those
information patches are common. For instance, figure
1.6 is based on data from a study of medical infor- 
mation seeking (Bhavnani, 2005). 5 Bhavnani, Jacob, 
Nardine, and Peck (2003) asked melanoma experts
to identify the melanoma risk facts that they considered
important for a melanoma patient to understand.
Figure 1.6a shows the distribution of melanoma risk 
facts across Web pages. Very few pages contain all 
14 expert-identified melanoma risk concepts, but 
many contain one of the melanoma risk facts. Figure 
1.6b is an estimate of the number of melanoma risk 
facts that a user would encounter as a function of 
visits to melanoma-related pages (Bhavnani et al., 
2003). Note that it is a diminishing returns curve
and that the user is expected to require 25 page visits
to find all expert-identified melanoma risk facts.

FIGURE 1.6 (a) The distribution of number of key
concepts about melanoma risk across Web pages,
and (b) the cumulative number of key concepts encountered
as a function of size of sample of pages
(Bhavnani, 2005; Bhavnani et al., 2003).
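To make the idea of a diminishing returns curve concrete, the following sketch counts how many distinct facts have been encountered after each page visit. The page contents are invented for illustration and are not Bhavnani's data; the sketch also follows a single fixed visit order rather than estimating over samples of pages.

```python
from typing import Iterable, List, Set


def cumulative_coverage(pages: Iterable[Set[int]]) -> List[int]:
    """Number of distinct facts seen after each successive page visit."""
    seen: Set[int] = set()
    curve: List[int] = []
    for facts in pages:
        seen |= facts            # add any facts not encountered before
        curve.append(len(seen))
    return curve


# Hypothetical pages, each holding a subset of 7 numbered facts.
pages = [{1, 2, 3, 4}, {3, 4, 5, 6}, {1, 6, 7}, {2, 7}, {5}]
curve = cumulative_coverage(pages)
# New facts contributed by each visit; these shrink as overlap grows.
gains = [curve[0]] + [later - earlier for earlier, later in zip(curve, curve[1:])]
```

In this made-up example the per-visit gains fall from 4 to 0 because later pages mostly repeat facts already seen, which is the overlap effect that produces a flattening cumulative curve.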

In the remaining sections of this chapter, I provide
an overview of the broader framework and method.
The remainder of this book is about the empirical 
and theoretical details. 

Man the Informavore 

All men by nature desire knowledge.— Aristotle, 
Metaphysics 

The human propensity to gather and use information 
to adapt to everyday problems in the world is a core 
piece of human psychology that has been largely ig- 
nored in cognitive studies. George A. Miller (1983), 
however, recognized the centrality of this human 
propensity to our cognitive natures and argued that 
mankind might fruitfully be viewed as a kind of in- 
formavore: a species that hungers for information in 
order to gather it and store it as a means for adapting 
to the world. Picking up on this idea, Dennett (1991) 
traced out a plausible evolutionary history in which 
he suggested that our ancestors might have developed 
vigilance behaviors that required surveying and as- 
sessing the current state of the environment, much 
like the prairie dogs who pop up on two feet to per- 
form their situation appraisals or the harbor seals that 
break the surface in the middle of a beach break to 
check out whether the surfers are friends, foe, or prey. 
Adaptive pressures to gain more useful, actionable 
knowledge from the environment could lead to the 
marshaling of available cognitive and behavioral ma- 
chinery, resulting in organisms, such as primates, that 
have active curiosity about the world and themselves. 
Humans, of course, are extreme in their reliance on 
information, with language and culture, and now 
modern technology, providing media for transmis- 
sion within and across generations. Humans are the 
Informavores rex of the current era. 

George Miller's notion of humans as informavores 
suggests that our genes have bestowed upon us an 
evolving behavioral repertoire that now includes the 
technological aspects of our culture associated with 
finding, saving, and communicating information. It is 
common in evolutionary discussions to distinguish between
genotype and phenotype (Johanssen, 1911). The
genotype is the blueprint for an individual. What gets
passed from one generation to the next (if it survives
and reproduces) are the genotypic blueprints. Phenotypes
are the outward manifestation of the genotype.
Typically, people think of this as the bodily
structure and behavior of the individual organism. 
However, Dawkins (1989) introduced the notion of 
extended phenotype to clarify the observation that 
the genotype has extended effects on the world at 
large that go beyond the actual body and behavior of 
the individual. Not only do beavers have tails, but they 
use them to make dams. Not only do spiders have legs, 
but they use them to make webs. Humans have not 
only brains but also external technology for storing 
information, and information foraging strategies that 
can be invoked to call forth the right knowledge, at 
the right time, to take useful action. It remains an 
open question as to why humans have evolved such 
information collection strategies, a question that I
raise again at the end of this book. 



The Adaptive Pressure of the Wealth 
of Information 

Thanks to science and technology, access to factual 
knowledge of all kinds is rising exponentially while dropping
in unit cost. ... We are drowning in information,
while starving for wisdom. -- E. O. Wilson, Consilience

Information Foraging Theory emerges from a serious 
consideration of Miller's notion of informavores. A 
serious consideration of the concept leads to ques- 
tions regarding the adaptive forces that drive human 
interaction with information. Simon (1971) articu- 
lated the basic design problem facing us: "What in- 
formation consumes is rather obvious: it consumes 
the attention of its recipients. Hence a wealth of in- 
formation creates a poverty of attention, and a need 
to allocate that attention efficiently among the over- 
abundance of information sources that might con- 
sume it" (pp. 40-41). 

According to statistics compiled by the University 
of California-Berkeley School of Information Sci- 
ence (Lyman & Varian, 2003), almost 800 megabytes 
of recorded information are produced per person per 
year, averaged over the estimated 6.3 billion people 
in the world. This is the equivalent of about 30 linear 
feet of books. In an information-rich world, the real 
design problem to be solved is not so much how to 
collect and distribute more information but rather 
how to increase the rate at which persons can find 
and attend to information that is truly of value to 
them. 
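The per-person figure above implies a world total that is easy to check. The short calculation below assumes decimal units (1 megabyte = 10^6 bytes, 1 exabyte = 10^18 bytes) and treats "almost 800 megabytes" as exactly 800, so the result is a rough upper estimate.

```python
MEGABYTE = 10**6                    # decimal megabyte (assumption)
EXABYTE = 10**18                    # decimal exabyte (assumption)

bytes_per_person = 800 * MEGABYTE   # "almost 800 megabytes" per person per year
population = 6_300_000_000          # estimated 6.3 billion people

total_bytes = bytes_per_person * population
total_exabytes = total_bytes / EXABYTE   # about 5 exabytes per year
```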

The Principle of the Extremization 
of Information Utility as a Function 
of Interaction Cost 

An investment in knowledge always pays the best 
interest. -- Benjamin Franklin

In modern society, people interact with information 
through technology that more or less helps them find 
and use the right knowledge at the right time. In 
evolutionary terms, one can argue that increasing the 
rate of gain of valuable information increases fitness. 
As Sir Francis Bacon observed, "knowledge is power." 
Power (control over the world to achieve one's goals) 
can be improved by better knowledge, or lower costs 
of access and application of knowledge. In evolu- 
tionary terms, an agent's fitness is improved to the 
extent that it can predict and control the environ- 
ment in order to solve the problems it faces in every- 
day life. In psychological terms, increasing the rate at 
which people can find, make sense of, and use valu- 
able information improves the human capacity to 
behave intelligently. We should expect adaptive sys- 
tems to evolve toward states that maximize gains of 
valuable information per unit cost (Resnikoff, 1989, 
p. 97). A useful way of thinking about such adapta- 
tion is to say that 

Human-information interaction systems will tend 
to maximize the value of external knowledge 
gained relative to the cost of interaction. 

Schematically, we may characterize this maximiza- 
tion tendency 6 as 

\[
\max \left[ \frac{\text{Expected value of knowledge gained}}{\text{Cost of interaction}} \right] \tag{1.3}
\]

Cognitive systems engaged in information foraging 
will exhibit such adaptive tendencies, and they will 
prefer technologies that tend to maximize the value 
(or utility) of knowledge gained per unit cost of in- 
teraction. For instance, sensory systems appear to 
evolve in ways that deliver more bits of information for 
the amount of calories expended. Similarly, offices, 
with their seeming chaotic mess of piles of papers, 
books, and files, appear to become organized in ways 
that optimize access costs of frequently needed infor- 
mation (Case, 1991; Malone, 1983; Soper, 1976). 
Resnikoff (1989, pp. 112-117) presented a mathematical
analysis suggesting that physical library catalog
card systems would become arranged in ways that
minimized manual search time. Information Forag- 
ing Theory assumes that people prefer information- 
seeking strategies that yield more useful information 
per unit cost. People tend to arrange their environ- 
ments (physical or virtual) to optimize this rate of gain. 
People prefer, and consequently select, technology 
designs that improve returns on information foraging. 
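Read as a decision rule, the principle above says to prefer whichever option yields the most value per unit of interaction cost. The sketch below applies that ratio to three invented strategies; the strategy names, values, and costs are hypothetical and only show the comparison, not measured data.

```python
from typing import Dict, Tuple


def rate_of_gain(value: float, cost: float) -> float:
    """Expected value of knowledge gained per unit cost of interaction."""
    if cost <= 0:
        raise ValueError("cost of interaction must be positive")
    return value / cost


# Hypothetical strategies: (expected value gained, interaction cost in seconds).
strategies: Dict[str, Tuple[float, float]] = {
    "browse-directory": (4.0, 120.0),
    "keyword-search": (3.0, 45.0),
    "ask-colleague": (5.0, 300.0),
}

preferred = max(strategies, key=lambda name: rate_of_gain(*strategies[name]))
```

Note that the preferred strategy here is not the one with the highest value in absolute terms; it is the one with the best ratio of value to cost.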

The Exaptation of Food Foraging Mechanisms 

Natural selection favored organisms— including our 
human ancestors — that had better mechanisms for 
extracting energy from the environment and translat- 
ing that energy into reproductive success. Organisms 
with better food-foraging strategies (for their particular 
environment) were favored by natural selection. Our 
ancestors evolved perceptual and cognitive mecha- 
nisms and strategies that were very well adapted to the 
task of exploring the environment and finding and 
gathering food. Information Foraging Theory assumes 
that modern-day information foragers use perceptual 
and cognitive mechanisms that carry over from the 
evolution of food-foraging adaptations. 

If information foraging is like food foraging, then 
models of optimal foraging developed in the study of 
animal behavior (Stephens & Krebs, 1986) and anthropology
(Winterhalder & Smith, 1992) should be
relevant. Figure 1.5 presents the conventional patch
model and Charnov's Marginal Value Theorem as a
possible analog for information foraging at a Web 
site. A typical optimal foraging model characterizes 
an agent's interaction with the environment as an 
optimal solution to the tradeoff of costs of finding, 
choosing, and handling food against the energetic 
benefit gained from that food. These models would 
look very familiar to an engineer because they are 
basically an attempt to understand the design of an 
agent's behavior by assuming that it is well engi- 
neered (adapted) for the problems posed by the en- 
vironment. Information foraging models include 
optimality analyses of different information-seeking 
strategies and technologies as a way of understanding 
the design rationale for user strategies and interaction 
technologies. 
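A minimal numerical sketch of the patch model may help here. It assumes a particular diminishing-returns gain function, g(t) = A(1 - exp(-t/tau)), and a fixed between-patch time; the functional form, the parameter values, and the names `t_between`, `A`, and `tau` are assumptions for illustration, not taken from the text. The forager's overall rate of gain is R(t) = g(t) / (t_between + t), and the sketch finds the within-patch time that maximizes it by a simple grid search.

```python
import math
from typing import Tuple


def gain(t: float, A: float = 10.0, tau: float = 5.0) -> float:
    """Cumulative within-patch gain g(t), with diminishing returns (assumed form)."""
    return A * (1.0 - math.exp(-t / tau))


def marginal_gain(t: float, A: float = 10.0, tau: float = 5.0) -> float:
    """Instantaneous rate of gain g'(t) for the assumed gain function."""
    return (A / tau) * math.exp(-t / tau)


def rate(t: float, t_between: float) -> float:
    """Overall rate of gain R = g(t) / (between-patch time + within-patch time)."""
    return gain(t) / (t_between + t)


def optimal_time(t_between: float, t_max: float = 60.0,
                 steps: int = 60000) -> Tuple[float, float]:
    """Grid search for the within-patch time that maximizes the overall rate."""
    best_t, best_r = 0.0, 0.0
    for i in range(1, steps + 1):
        t = t_max * i / steps
        r = rate(t, t_between)
        if r > best_r:
            best_t, best_r = t, r
    return best_t, best_r


t_star, r_star = optimal_time(t_between=10.0)
# At the optimum, the marginal gain g'(t*) approximately equals the overall
# rate R, which is the leaving condition stated by the Marginal Value Theorem.
```

Under these assumptions, increasing `t_between` moves the optimum later: when patches are harder to reach, it pays to stay longer in each one.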

Optimal foraging theorists assume that energy, orig- 
inating predominantly from the sun, seeps through the 
food chain to be deposited in various plants and animals
that are distributed variably through the environment.
Food foragers may have different mechanisms
and strategies available to them for navigating
through the environment. Their potential sources of 
food may have different prevalences in different hab- 
itats and may have different profitabilities in terms of 
how many calories can be extracted when foraged. 
The optimal forager is one who has the strategies, 
mechanisms, diets, and so forth, that maximize the 
calories gained per unit of effort expended. 7 Similarly, 
Information Foraging Theory assumes that informa- 
tion comes to be stored in various prevalences in dif- 
ferent kinds of repositories, in various forms and 
media. The information forager has different means 
available for navigating and searching the information 
environment, and different information sources have 
different profitabilities in terms of the interaction cost 
required to gain useful information. As suggested by 
equation 1.3, the optimal information forager is one 
who maximizes the value of knowledge gained per 
unit cost of interaction. 

Application to Human-Information 
Interaction 

The legacy of the Enlightenment is the belief that en- 
tirely on our own we can know, and in knowing, under- 
stand, and in understanding, choose wisely. — E. O. Wilson, 
Consilience 

Human-information interaction (HII) is a nascent 
field that is concerned with how people interact with, 
and process, outwardly accessible information in ser- 
vice of their goals. 8 It adopts an information-centric 
approach rather than the computer-centric approach 
of the field of human-computer interaction (HCI) 
(Lucas, 2000). This shift to an information-centric 
focus is a natural evolution for the field of HCI be- 
cause of the increasing pervasiveness of information 
services, the increasing transparency of user inter- 
faces, the convergence of information delivery tech- 
nologies, and the trend toward ubiquitous computing. 

Access to the Internet is pervasive in the developed 
world through land lines, satellite, cable, and mobile 
devices. The field of HCI, over the past two decades 
and more, has led to the development of computers 
and computer applications that are transparent to 
users performing their tasks. In parallel, the business 
world around consumer media technologies shows 
excitement over the convergence of television, cell 
phones, personal computers, PDAs (personal digital 
assistants), cars, set-tops, and other consumer elec- 
tronics devices, as well as the convergence among the 
means for transporting information, such as the Internet,
radio, satellite, and cable. Research on ubiquitous
computing looks forward to a world in which
computational devices are basically everywhere in our 
homes, mobile devices, cars, and so on, and these 
devices can be marshaled to perform arbitrary tasks for 
users. The net effect of these trends is to make comput- 
ers invisible, just as electricity and electric motors are 
invisible in homes today (Lucas, 2000). As computers 
become invisible, and information becomes ample 
and pervasive, we expect to see a shift in studies from 
HCI to HII. Rather than focus on the structure of de- 
vices and application programs, the focus of HII re- 
search must center on content and interactive media. 

Information Foraging Theory arose during the 
1990s, coinciding with an explosion in the amount of 
information that became available to the average 
computer user and with the development of new 
technologies for accessing and interacting with infor- 
mation. The late 1980s witnessed several strands of 
HCI research that were devoted to ameliorating prob- 
lems of exploring and finding electronically stored 
information. It had become apparent that users could 
no longer remember the names of all their electronic 
files, and it was even more difficult for them to guess 
the names of files stored by others (Furnas, Landauer, 
Gomez, & Dumais, 1987). One can see proposals in 
the mid- to late 1980s HCI literature for methods to 
enhance users' ability to search and explore external 
memory. Jones (1986) proposed the Memory Ex- 
tender (ME), which used a model of human associa- 
tive memory (Anderson, 1983) to automatically re- 
trieve files represented by sets of keywords that were 
similar to the sets of keywords representing the users' 
working context. Latent Semantic Analysis (LSA; 
Dumais, Furnas, Landauer, Deerwester, & Harsh- 
man, 1988) was developed to mimic human ability to 
detect deeper semantic associations among words, 
such as "dog" and "cat," to similarly enhance infor- 
mation retrieval. Interestingly, the work on ME and 
LSA was contrasted with work in the "traditional" field 
of information retrieval in computer science, which 
had a relatively long history of developing automated 
systems for storing and retrieving text documents. The 
CHI '88 conference where LSA was introduced also 
hosted a panel bemoaning the fact that automated 
information retrieval systems had not progressed to the 
stage where anyone but dedicated experts could operate
them (Borgman, Belkin, Croft, Lesk, & Landauer,
1988). Such systems, however, were the direct

ancestors of modern search engines found on the 
World Wide Web. 

Hypermedia also became a hot topic during the 
late 1980s, with Apple's introduction of HyperCard 
in 1987, the first ACM Conference on Hypertext in 
1987, and a paper session at the CHI '88 conference. 
The very idea of hypertext can be traced back to 
Vannevar Bush's Atlantic Monthly article, "As We 
May Think," published in 1945. Worried about schol- 
ars becoming overwhelmed by the amount of infor- 
mation being published, Bush proposed a mechanized 
private file system, called the Memex, that would 
augment the memory of the individual user. It was 
explicitly intended to mimic human associative 
memory. Bush's article influenced the development of 
Douglas Engelbart's NLS (oNLine System), which 
was introduced to the world in a tour-de-force dem- 
onstration at the 1968 Fall Joint Computer Confer- 
ence. The demonstration of NLS — a system explicitly 
designed to "augment human intellect" (Engelbart, 
1962) — also introduced the world to the power of 
networking, the mouse, and point-and-click interac- 
tion. Hypertext and hypermedia research arose during 
the late 1980s because personal computing power, 
networking, and user interfaces had evolved to the 
point where the visions of Bush and Engelbart could 
finally be realized for the average computer user. 

The confluence of increased computing power, 
storage, networking and information access, and hypermedia
research in the late 1980s set the stage for the
widespread deployment of hypermedia in the form
of the World Wide Web. In 1989, Tim Berners-Lee
(1989) proposed a solution to the problems that were 
being faced by the CERN community in dealing with 
distributed collections of documents, which were stored 
on many types of platforms, in many types of formats. 
This proposal led directly to the development of 
HTML, HTTP, and, in 1990, the release of the World 
Wide Web. Berners-Lee's vision was not only to provide
users with more effective access to information but 
also to initiate an evolving web of information that re- 
flected and enhanced the community and its activities. 

The emergence of the Web in the 1990s provided 
new challenges and opportunities for HCI. The in- 
creased wealth of accessible content, and the use of 
the Web as a place to do business, exacerbated the 
need to improve the user experience on the Web. 

The usability literature that has evolved sur- 
rounding the Web user experience is incredibly rich 
with design principles and maxims (Nielsen, 2000;
Spool, Scanlon, Schroeder, Snyder, & DeAngelo,
1999), the most important of which is to test designs 
with users. Much of this literature is based on a mix of 
empirical findings and expert ("guru") opinion. A 
good deal of it is conflicting. The development of 
theory in this area can greatly accelerate progress and 
meet the demands of changes in the way we interact 
with the Web. Greater theoretical understanding and 
the ability to predict the effects of alternative designs 
could bring greater coherence to the usability litera- 
ture and provide more rapid evolution of better 
designs. In practical terms, a designer armed with 
such theory could explore and explain the effects of 
different design decisions on Web designs before the 
heavy investment of resources for implementation and 
testing. This exploration of design space is also more 
efficient because the choices among different design 
alternatives are better informed: Rather than randomly
generating and testing design alternatives, the
designer is in a position to know which avenues are 
better to explore and which are better to ignore. Un- 
fortunately, cognitive engineering models that have 
been developed to deal with the analysis of expert 
performance on well-defined tasks involving applica- 
tion programs (Pirolli, 1999) have little applicability to 
understanding foraging through content-rich hyper- 
media, and consequently new theories are needed. 

Methodological Adaptationism 

Adaptationist reasoning is not optional; it is the heart and 
soul of evolutionary biology. — D. C. Dennett, Darwin's 
Dangerous Idea 

The concept of informavores, and concern with the 
application domain of HII, leads us to reconsider the 
dominance of strictly mechanistic analyses of HCI. 
Miller, in his 1983 article about "informavores," com- 
mented on the incompleteness of the mechanistic 
approach by using the following analogy: 

Insofar as a limb is a lever, the theory of levers 
describes its behavior— but a theory of levers does 
not answer every question that might be asked 
about the structure and function of the limbs of 
animals. Insofar as the mind is used to process 
information, the theory of information processing 
describes its behavior— but a theory of informa- 
tion processing does not answer every question 
that might be asked about the structure and func- 
tion of the minds of human beings, (p. 112) 
Information processing (mechanistic) analyses of
HCI— by themselves— give only partial explanations. 
They provide mechanistic explanations of the "le- 
vers" of the mind. In reaction to this inadequacy, 
Information Foraging Theory has been guided by the 
heuristics and explanatory framework of methodological
adaptationism, and the specific version of it
developed by Anderson (1990) called rational anal- 
ysis (see also Oaksford & Chater, 1998). The illus- 
tration above concerning hotel prices on the Web 
involved a very simple rational analysis. Methodo- 
logical adaptationism presumes that it is a good 
heuristic for scientists to assume that evolving, behav- 
ing systems are rational, or well designed, for fulfilling 
certain functions in certain environments. There is an 
assumption of ecological rationality regarding the 
behavior of the system being observed (Bechtel, 1985;
Dennett, 1983, 1988, 1995; Gigerenzer, 2000). The 
adaptationist approach involves a kind of reverse en- 
gineering in which the analyst asks (a) what envi- 
ronmental problem is solved, (b) why is a given sys- 
tem a good solution to the problem, and (c) how is 
that solution realized (approximated) by mechanism. 

Versions of methodological adaptationism have 
shaped research programs in behavioral ecology (e.g., 
Mayr, 1983; Stephens & Krebs, 1986; Tinbergen,
1963), anthropology (e.g., Winterhalder & Smith, 
1992), and neuroscience (e.g., Glimcher, 2003). The 
approach gained currency in cognitive science during 
the 1980s as a reaction to ad hoc models of how people 
performed complex cognitive or perceptual tasks. At 
that time, models of cognition and perception were 
generally mechanistic, detailing perceptual and cog- 
nitive structures and the processes that transformed 
them. The Model Human Processor (MHP) and 
GOMS (Goals, Operators, Methods, and Selection 
rules; Card, Moran, & Newell, 1983) are cognitive 
engineering examples in the field of HCI that derive 
from this approach. The MHP specifies a basic set of 
information storage and processing machinery, much 
like a specification of the basic computer architecture 
for a personal computer. GOMS specifies basic task 
performance processes, much like a mechanical pro- 
gram that "runs" on the MHP. 

Around the same time that GOMS and MHP 
were introduced into HCI, there emerged a concern 
among cognitive scientists that mechanistic infor- 
mation processing models, by themselves, were not 
enough to understand the human mind (Anderson, 
1990; Marr, 1982). A major worry was that mechanistic
models of cognition had been developed in an
ad hoc way and provided an incomplete explanation 
of human behavior. It had become common practice 
to cobble together a program that simulated human 
performance on some task and then claim that the 
program was in fact a theory of the task (Marr, 1982, 
p. 28). Anderson (1990) lamented that cognitive mod- 
elers "pull out of an infinite grab bag of mechanisms 
bizarre creations whose only justification is that they
predict the phenomena in a class of experiments. ...
We almost never ask the question of why these
mechanisms compute the way they do" (p. 7, em- 
phasis added). 

Figuring out a mechanistic account of human 
behavior— for instance, with MHP analysis — is no 
small feat. However, as the Miller quote above suggests, 
such accounts do not explain everything. The mind is 
not just any old arbitrary, cobbled-together machine; 
rather, it is a fantastically complex machine that has 
been designed by evolution to be well tailored to the 
demands of surviving and reproducing in the envi- 
ronment. The adaptationist approach recognizes that 
one can better understand a machine by understand- 
ing its function. By this I mean both that (a) adapta- 
tionist accounts make more sense and (b) the search 
for better understanding proceeds at a faster pace. 

Levels of Explanation 

The analysis of people interacting with information 
involves interrelated layers of explanation. This is 
because scientific models in this area assume that 
human activity is (a) purposeful and adaptive, which 
requires a kind of rational analysis, (b) based on knowl- 
edge, (c) computed by information processing mech- 
anisms, which are (d) realized by physical, biological, 
processes. Table 1.3 presents a summary of the rele- 
vant framework that has emerged in the behav- 
ioral sciences (see, e.g., Anderson, 1990; Cosmides, 
Tooby, & Barkow, 1992; Gigerenzer, 2000; Winterhalder
& Smith, 1992a).

Rational analysis, in the case of Information 
Foraging Theory, focuses on the task environment 
that is the aim of performance, the information en- 
vironment that structures access to valuable knowl- 
edge, and the adaptive fit of the HII system to the 
demands of these environments. Rational analysis 
assumes that the structure of behavior can be un- 
derstood in terms of its adaptive fit to the structure 
and constraints of the environment. The analysis of
searching for hotel prices on the Web involved a rational
analysis of the expected savings to be gained from
information search and an analysis of the rational
choice to make when faced with decisions of whether
to continue search or to give up. When performing a
rational analysis the theorist may be said to take a
design stance (Dennett, 1995) that focuses on an
analysis of the functionality of the system with respect
to its ostensive purpose. At this level, the analyst acts
most purely as an engineer concerned with why users'
behavior is rational given the task context in which it
occurs, and it is assumed that users are optimizing
their performance in achieving their goals.

TABLE 1.3 Levels of explanation.

| Level | Question | Stance | Analysis Elements | Examples |
|---|---|---|---|---|
| Rational | What environmental problem is solved? Why is this solution a good one? | Design | States, resources, state dynamics; constraints, affordances; feasible strategies; optimization criteria | Optimal foraging theory; Information Foraging Theory |
| Knowledge | What does the system know? | Intentional | Environment; goals, preferences; knowledge; perception, action | Knowledge-level analysis |
| Cognitive | How does the system do it? | Information processing | Cognitive states; cognitive processes | ACT-R; Soar |
| Biological | How does the system physically do it? | Biophysical | Neural processes | Neural models |

Knowledge-level analysis concerns the knowledge 
content involved in achieving goals. Knowledge-level 
analysis involves descriptions of a system in inten- 
tional terms with the assumption that behavior is the 
product of purposes, preferences, and knowledge. 
The knowledge level has been important in artificial 
intelligence since its introduction by Newell (1982). 
A knowledge-level analysis of the task of searching for 
hotel prices on the Web was a prerequisite to the 
specification of the production rules and chunks in- 
volved in the cognitive simulation. Dennett (1988) 
defined an observer who describes a system using 
an intentional vocabulary (e.g., "know," "believe," 
"think") as one taking an intentional stance. Typi- 
cally, a task analysis focuses mainly on an analysis of 
users' knowledge, preferences, perceptions, and actions,
with respect to the goal and environment. At
this level of analysis, it is assumed that users deploy
their knowledge to achieve their goals, and the focus 
is on identifying what knowledge is involved. 

Modern cognitive psychology assumes that the 
knowledge level can be given a scientific account 
(i.e., be made predictable) by explaining it in terms 
of mechanistic information processing (Newell, 
1990). This is the cognitive level of explanation. This 
level of analysis focuses on the properties of the in- 
formation processing machinery that evolution has 
dealt to humans to perceive, think, remember, learn, 
and act in what we would call purposeful and knowledgeable
ways. This is the level of most traditional
theorizing in cognitive psychology and HCI — the 
level at which computational models may, in prin- 
ciple, be developed to simulate human cognition. 
GOMS (Card et al., 1983), described above, is an
example of an analysis method aimed at cognitive-level
analysis. Cognitive architectures such as ACT-R
(Anderson et al., 2004) or Soar (Newell, 1990) and
the simulations developed in those architectures are 
developed at the cognitive level. The production 
system specified in table 1.1 was a simple example of 
a cognitive-level analysis. 

Accounts at the cognitive level are assumed to 
be instantiated at the biological level by the physi- 
cal machinery of the brain and body. The biological 
level of explanation specifies the proximal physical 
mechanisms underlying behavior. For instance, An- 
derson et al. (2004) have recently presented results 
suggesting the mapping of the ACT-R architecture
onto neural structure and functioning. 

Phenomena at Different Time Scales 
of Behavioral Analysis 

Many of our goals can drive our behavior for days, 
months, and even years. These longer term goals 
are typically realized by task structures composed of 
many shorter term goals. Card et al. (1983) suggested 
that there is a base level of tasks, called the unit task 
level, that controls immediate behavior. Unit tasks 
empirically take about 10 seconds. To an approxi- 
mation, unit tasks are where "the rational rubber 
meets the mechanistic road." To an approximation, 
the structure of behavior above the unit task level 
largely reflects a rational structuring of the task within 
the constraints of the environment, whereas the struc- 
ture within and below the unit task level reflects 
cognitive and biological mechanisms. Phenomena 
occur at multiple grain sizes of time, and effects 
propagate in both upward and downward directions: 
Rational/ecological structuring goes downward from 
longer time scales of phenomena, and environment 
and proximal mechanism constraints go upward. A 
significant claim of the framework adopted by In- 
formation Foraging Theory from Newell (1990) and 
Anderson (2002) is that the phenomena of human 
cognition can be decomposed and modeled at many 
different time scales. 

Newell (Newell, 1990; Newell & Card, 1985) 
argued that human behavior arises from a hierarchi- 
cally organized system in which the basic time scale 
of operation of each system level increases by a factor 
of 10 as one moves up the hierarchy (table 1.4). The 
phenomena at each band in table 1.4 are largely 
dominated by different kinds of factors. Behavioral 
analysis at the biological band (approximately milli- 
seconds to tens of milliseconds) is dominated by 
biochemical, biophysical, and especially neural pro- 
cesses, such as the time it takes for a neuron to fire. 
The psychological band of activity (approximately 
hundreds of milliseconds to tens of seconds) has 
been the main preoccupation of cognitive psychology 
(Anderson, 1983, 1993; Newell, 1990). At this time 
scale, it is assumed that elementary cognitive mech- 
anisms play a major part in shaping behavior. The 
typical unit of analysis is a single response function, 
involving a perceptual input stage, a cognitive stage, 
and a stage of action output (for instance, finding a
word in the menu of a text editor and moving a mouse
to select the menu item). The mechanisms involved at
this level of analysis include elementary information
processing functions such as memory storage and
retrieval, recognition, categorization, comparison of
one information element to another, and choosing
among alternative actions.

TABLE 1.4 Time scale on which human action occurs.

| Scale (seconds) | Time Unit | Band |
|---|---|---|
| 10^7 | Months | Social |
| 10^6 | Weeks | Social |
| 10^5 | Days | Social |
| 10^4 | Hours | Rational |
| 10^3 | 10 minutes | Rational |
| 10^2 | Minutes | Rational |
| 10^1 | 10 seconds | Cognitive |
| 10^0 | 1 second | Cognitive |
| 10^-1 | 100 milliseconds | Cognitive |
| 10^-2 | 1 millisecond | Biological |

Note: Different bands are quite different phenomenological worlds.
Adapted from Newell (1990, p. 122).

As the time scale of activity increases, "there will be 
a shift towards characterizing a system . . . without re- 
gard to the way in which the internal processing ac- 
complishes the linking of action to goals" (Newell, 
1990, p. 150). This is the rational band of phenomena
(minutes to days). The typical unit of analysis at this 
level is the task, which is defined, in part, by a goal. It is 
assumed that an intelligent agent will have preferences 
for actions that it perceives to be applicable in its en- 
vironment and that it knows will move the current 
situation toward the goal. So, on the one hand, goals, 
knowledge, perceptions, actions, and preferences 
shape behavior. On the other hand, the structure, 
constraints, and resources of the environment in 
which the task takes place, called the task environment
(Newell & Simon, 1972), will also greatly
shape behavior. Explanations at the rational band as- 
sume that behavior is governed by rational principles 
and that it is largely shaped by the structure and con- 
straints of the task environment, although it is also 
realized that people are not infinitely and perfectly 
rational (Simon, 1955). The rationale for behavior at 
this level is its adaptive fit to its task environment. 

Task Environments and
Information Environments 

To understand information foraging requires analysis 
of the environment in addition to analysis of the for- 
ager. The importance of the analysis of the environ- 
ment to psychology was a more general point made by 
Brunswik (1952) and Simon (1981). It is useful to 
think of two interrelated environments in which an 
information forager operates: the task environment and 
the information environment. The classical definition 
of the task environment is that it "refers to an envi- 
ronment coupled with a goal, problem or task— the 
one for which the motivation of the subject is as- 
sumed. It is the task that defines a point of view about 
the environment, and that, in fact allows an environ- 
ment to be delimited" (Newell & Simon, 1972, p. 55). 
The task environment is the scientist's analysis of those 
aspects of the physical, social, virtual, and cognitive 
environments that drive human behavior. 

The information environment is a tributary of 
knowledge that permits people to more adaptively 
engage their task environments. Most of the tasks that 
we identify as significant problems in our everyday life 
require that we get more knowledge — become better 
informed— before taking action. What we know, or do 
not know, affects how well we function in the impor- 
tant task environments that we face in life. External 
content provides the means for expanding and im- 
proving our abilities. The information environment, 
in turn, structures our interactions with this content. 
Our particular analytic viewpoint on the information 
environment will be determined by the information 
needs that arise from the embedding task environ- 
ment. From the standpoint of a psychological analysis, 
the information environment is delimited and defined 
in relation to the task environment. 

Problem Spaces 

A large class of tasks may be understood as variations on 
problem solving. Indeed, Newell (1990) essentially 
argued that all of cognition could be understood by
taking this stance. Newell and Simon (1972) charac- 
terized problem solving formally as a process of search 
through a problem space. A problem space consists of 
an initial situation called the start state and some de- 
sired situation called the goal state. Other situations 
that may occur while solving the problem are intermediate
states. Problem-solving operators (e.g., actions
performed by the problem solver) transform problem
states. For instance, the problem faced by a toddler 
seeking to eat cookies from a cupboard may have an 
initial state that consists of the child standing on the 
floor and a chair some distance away, and the child 
may apply problem-solving operators such as moving 
the chair, climbing on the chair, and opening the 
cupboard to transform the initial state toward the goal 
state. The various states that can be achieved are re- 
ferred to as a problem space (or sometimes a state 
space). Often, any given problem state is a situation 
that affords many possible actions (operators). In such 
cases, each state branches to many possible subsequent 
states, with each branch in each path corresponding to 
the application of an operator. The problem is to find 
some path through the maze of possible states. Finding 
this path is a process of search through a problem space. 
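
The search process just described can be illustrated with a minimal sketch. The state encoding and the operator names (`move_chair`, `climb_chair`, `open_cupboard`, `take_cookie`) are assumptions introduced here for illustration; they are not part of Newell and Simon's formalism, and breadth-first search is only one of many strategies a problem solver might use.

```python
from collections import deque
from typing import Callable, Dict, List, NamedTuple, Optional


class State(NamedTuple):
    """Hypothetical encoding of the toddler's situation."""
    chair_at_cupboard: bool
    on_chair: bool
    cupboard_open: bool
    has_cookie: bool


def move_chair(s: State) -> Optional[State]:
    # The chair can be pushed only while the child stands on the floor.
    if not s.on_chair and not s.chair_at_cupboard:
        return s._replace(chair_at_cupboard=True)
    return None


def climb_chair(s: State) -> Optional[State]:
    if not s.on_chair:
        return s._replace(on_chair=True)
    return None


def open_cupboard(s: State) -> Optional[State]:
    # The cupboard is reachable only from a chair placed beneath it.
    if s.on_chair and s.chair_at_cupboard and not s.cupboard_open:
        return s._replace(cupboard_open=True)
    return None


def take_cookie(s: State) -> Optional[State]:
    if s.on_chair and s.chair_at_cupboard and s.cupboard_open and not s.has_cookie:
        return s._replace(has_cookie=True)
    return None


# Operators map a state to a successor state, or None when not applicable.
OPERATORS: Dict[str, Callable[[State], Optional[State]]] = {
    "move_chair": move_chair,
    "climb_chair": climb_chair,
    "open_cupboard": open_cupboard,
    "take_cookie": take_cookie,
}


def search(start: State, is_goal: Callable[[State], bool]) -> Optional[List[str]]:
    """Breadth-first search for a path of operators from start to a goal state."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path
        for name, operator in OPERATORS.items():
            successor = operator(state)
            if successor is not None and successor not in visited:
                visited.add(successor)
                frontier.append((successor, path + [name]))
    return None  # no path through the problem space reaches the goal


START = State(False, False, False, False)
PLAN = search(START, lambda s: s.has_cookie)
```

Note that climbing the chair before moving it leads to a dead end in this toy space, so the search must consider and discard that branch before it finds a path to the goal state.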

Ill-Structured Problems and 
Knowledge Search 

Well-structured problems, such as puzzles and games,
have well-defined initial states, goal states, operators,
and other problem constraints. This contrasts with
ill-structured problems. Ill-structured problems,
such as choosing a medical treatment or buying a 
house, typically require additional knowledge from 
external sources in order to better understand the 
starting state, to better define a goal, or to specify the 
actions that are afforded at any given state (Simon, 
1973). People typically need to perform knowledge 
search in order to solve their ill-structured problems 
(e.g., to define aspects of a problem space that permit 
effective or efficient problem space search). The in- 
formation environment is a potential source of valu- 
able knowledge that can improve our ability to achieve 
our goals, especially when they involve ill-structured 
tasks. More generally, knowledge shapes human func- 
tionality, and consequently external access to large 
volumes of widely variegated knowledge may improve 
our range of adaptation because we can solve more 
problems, or solve problems using better approaches. 

Knowledge-Level Systems 

Knowledge, if it does not determine action, is dead 
to us. — Plotinus 

Externally available content provides us with knowl- 
edge valuable to the achievement of our goals. Given 
the central role of external knowledge to Information
Foraging Theory, it is useful to review Newell's
(1982) influential framework for the study of knowl- 
edge systems. This provides a way of characteriz- 
ing adaptation in terms of knowledge content. This 
framework, which arises from the cognitive sciences, 
assumes that knowledge shapes the functionality of 
our cognitive abilities and that intelligent behavior 
depends on finding and using the right knowledge 
at the right time. This framework was largely articu- 
lated by Allen Newell (1982, 1990, 1993) and Daniel 
Dennett (1988, 1991). Traditionally (e.g., Dennett, 
1988; Newell, 1990), the information processing 
system under consideration for analysis is an unaided 
person or computer program working in some task 
environment. However, we can extend the approach 
to understand a system that consists of a person tightly 
coupled with technological support and access to a 
teeming world of information. 

Over the course of 20 years, Newell (Moore & 
Newell, 1973; Newell, 1982, 1990; Newell et al., 1992)
developed a set of ideas about understanding how 
physical systems could be scientifically characterized 
as knowledge systems. A parallel set of ideas was de- 
veloped by Dennett (1988) in his discussion of inten- 
tional systems. 9 The notions developed by Newell and 
Dennett derive from the philosophical contributions of 
Brentano (1874/1973). The knowledge level was de- 
veloped by Newell (1982) as a way to address questions 
about the nature of knowledge and the nature of sci- 
entifically ascribing knowledge to an agent. 

In the frame of reference developed by Newell and 
Dennett, scientific observers ascribe knowledge to be- 
having systems. A key assumption is that knowledge- 
level systems can be specified completely by reference 
to their interaction with the external world, without 
reference to the mechanical means by which the in- 
teractions take place. A knowledge-level system con- 
sists of an agent behaving in an environment. The 
agent consists of a set of actions, a set of perceptual 
devices, a goal (of the agent), and a body of knowledge. 
The operation of such systems is governed by the 
principle of rationality: If the agent knows that one of 
its actions will lead to a situation preferred according 
to its goal, then it will intend the action, which will 
then be taken if it is possible. As Newell (1982) stated, 
knowledge is "whatever can be ascribed to an agent, 
such that its behavior can be computed according to 
the principle of rationality" (p. 105). In essence, the 
basic observations at the knowledge level are state- 
ments of the form: 

In situation S, agent A behaves as if it has knowledge K.
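
As a rough illustration of how the principle of rationality links knowledge to action, consider the sketch below. The `Agent` class, its field names, and the shopping example are hypothetical conveniences, not Newell's notation; knowledge is reduced here to a lookup from (situation, action) pairs to expected outcomes, which is far simpler than anything a real knowledge-level description would ascribe.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Agent:
    """A toy knowledge-level agent: actions, a body of knowledge, and a goal."""
    actions: List[str]
    # Knowledge: the outcome the agent believes an action has in a situation.
    knowledge: Dict[Tuple[str, str], str]
    # Goal: a test of whether a situation is preferred.
    goal: Callable[[str], bool]

    def intended_actions(self, situation: str) -> List[str]:
        """Principle of rationality: intend every action that the agent
        knows will lead to a situation preferred according to its goal."""
        return [
            action
            for action in self.actions
            if (situation, action) in self.knowledge
            and self.goal(self.knowledge[(situation, action)])
        ]


def wants_cheapest_price(situation: str) -> bool:
    return situation == "knows_cheapest_price"


shopper = Agent(
    actions=["buy_now", "visit_comparison_site"],
    knowledge={("needs_hotel", "visit_comparison_site"): "knows_cheapest_price"},
    goal=wants_cheapest_price,
)

# Same actions and goal, but no knowledge of what the actions achieve.
uninformed = Agent(
    actions=["buy_now", "visit_comparison_site"],
    knowledge={},
    goal=wants_cheapest_price,
)
```

An observer who sees the first agent head for the comparison site in the situation `needs_hotel` can ascribe to it the knowledge that the site reveals the cheapest price. The second agent, with the same actions and goal but no such knowledge, intends nothing.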

Value and Structure of Knowledge 

New knowledge is the most valuable commodity on earth. 
The more truth we have to work with, the richer we 
become. — Kurt Vonnegut, Breakfast of Champions 

Our ability to solve ill-structured problems such as
buying a house, finding a job, or throwing a Super
Bowl party is, in large part, a reflection of the par- 
ticular external knowledge used to structure and solve 
the problem. Consequently, the value of external con- 
tent may often ultimately be measured in the im- 
provements to the outcomes of an embedding task. 
The value of knowledge gained may be measured in 
terms of what additional value it attains for the agent. 
Of course, a lot of external content provides no new 
knowledge (e.g., perhaps it is "old news" to us), or 
information that does not contribute to our goals. 

In simple well-structured problems, the value of 
knowledge gained from information foraging can be 
generally expressed as a difference between two strat- 
egies: one that rationally uses knowledge acquired by 
foraging from external information sources to choose 
among outcomes, and another that does not use such 
information. 10 For instance, suppose a man who has a 
budget wants to purchase a product on the Web and 
knows of a price comparison Web site (e.g., as in 
the hotel illustration above). If blindly purchasing a 
product costs a certain expected amount X, but after 
visiting the price comparison Web site the man will be 
able to find a less expensive product at price Y, then the net
value of that knowledge will be X - Y - C, where C is 
some measure of the cost of gaining the knowledge. If 
the analysis in the hotel price illustration above were 
correct, then the expected price of a hotel (without 
knowledge) would have been about $86 (see the ap- 
pendix), but after looking at a Web site, the price 
would have been $66, and the time cost would be 
approximately 13 min/60 min × $10/hr = $2, so the
value of the Web site knowledge would be $86 - $66 - $2 = $18.
In simple cases such as these, one
may imagine that a person could completely con- 
struct a decision model in which all possible decision 
outcomes are specified, as well as the relationships 
among information sources, potential results from 
those sources, and the relation of information results 
gathered to decisions and the utility of those decisions. 
Indeed, artificial intelligence systems (e.g., Grass & 
Zilberstein, 2000) have been developed to use this
approach to tackle problems such as purchasing a 
digital camera, purchasing a removable media device, 
or choosing a restaurant. Real-world problems, how- 
ever, typically require a more complicated analysis of 
the value of knowledge. 
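
The arithmetic in the hotel example can be checked with a short sketch. The function and parameter names are assumptions made for illustration; the figures are those given in the passage, reading the search time as 13 minutes.

```python
def time_cost(minutes: float, hourly_rate: float) -> float:
    """Cost C of the time spent gaining the knowledge, in dollars."""
    return minutes / 60.0 * hourly_rate


def net_value_of_knowledge(price_uninformed: float,
                           price_informed: float,
                           cost: float) -> float:
    """Net value X - Y - C of knowledge gained by foraging."""
    return price_uninformed - price_informed - cost


# Figures from the hotel illustration: X = $86, Y = $66, 13 minutes at $10/hr.
cost = time_cost(minutes=13, hourly_rate=10.0)
value = net_value_of_knowledge(86.0, 66.0, cost)
```

Unrounded, the time cost is a little over $2 and the net value a little under $18, which agrees with the rounded figures in the text.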

Knowledge and Intelligence 

Knowledge is of two kinds: we know a subject ourselves, 
or we know where we can find information upon it. 
— Samuel Johnson 

Physically instantiated cognitive systems are limited in 
their ability to behave as rational knowledge-level sys- 
tems. Newell (1990) proposed that "intelligence is the 
ability to bring to bear all the knowledge that one has in 
service of one's goals" (p. 90). 11 This corresponds to 
our everyday notion that we can behave more intelli- 
gently by being better informed. In the idealized view 
of the knowledge level, everything in a body of 
knowledge (including all possible entailments) is in- 
stantly accessible. However, people, or any physical 
system, can only approximate such perfect intelligent 
use of knowledge because the ability to bring forth the 
right knowledge at the right time is physically limited. 
The laws of physics limit the amount of information 
that can be stored or processed in a circumscribed 
portion of space and time. Within those limits, how- 
ever, intelligence increases with the ability to bring to 
bear the right knowledge at the right time. 

Dennett (1991, pp. 222-223) notes that this con- 
ception of knowledge and intelligent reasoning goes 
back to Plato (Theaetetus, 197-198a, Cornford 
translation). Plato saw knowledge as something that 
one could possess like a man who keeps captured 
wild birds in an aviary. There is a sense in which the 
man has the birds, but a sense in which he has none 
of them until he can control each bird by calling 
forth the bird at will. Plato saw intelligent reasoning 
as not only having the birds but also having the 
control to bring forth the right bird at the right time. 

Newell's discussions focused on unaided intelli- 
gent systems (people or computer programs) and the 
knowledge that they had available in their local 
memories. But there is a sense in which the world 
around us provides a vast external memory teeming 
with knowledge that can be brought forth to remedy a 
lack on the part of the individual. We can extend 
Newell's notion of intelligence and argue that intelligence
is improved by enhancement of our ability to
bring forth the right knowledge at the right time from
the external world. Of course, the world (both phys- 
ical and virtual) shapes the manner in which we can 
access and transform knowledge-bearing content and 
thus shapes the degree to which we reason and be- 
have intelligently. The task of acquiring knowledge 
from external sources is itself a task that can be per- 
formed more or less intelligently. 

Consider the illustration above in which a hypo- 
thetical user searches for hotel prices on the Web. 
From a knowledge-level perspective, the user has 
knowledge of how to navigate the Web, operate the 
Web site search engine, and perform price compari- 
sons. The illustration assumed that the user applies 
this knowledge flawlessly, but the structure of the 
Web environment determines the rate at which new 
knowledge (of hotel prices) is gained. A different de- 
sign could improve the rate at which the user accom- 
plishes the task. For instance, if the Web site sorted 
hotels by both quality (star rating) and price, the user 
could accomplish the task much faster. Although the 
user's navigation and calculation knowledge has not 
changed, it is being applied more efficiently because of 
a change in the information environment. In other 
words, a change in the information environment has 
made the user more intelligent. 
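
A small sketch may make the point concrete. The hotel listings below are invented, and counting the listings a user must examine is only a crude stand-in for time cost; the sketch assumes the user wants the cheapest hotel at a given star rating.

```python
from typing import List, Optional, Tuple

Hotel = Tuple[str, int, float]  # (name, stars, price); hypothetical data

HOTELS: List[Hotel] = [
    ("A", 2, 95.0), ("B", 3, 120.0), ("C", 2, 66.0),
    ("D", 1, 50.0), ("E", 2, 86.0), ("F", 3, 140.0),
]


def cheapest_unsorted(hotels: List[Hotel], stars: int) -> Tuple[Optional[float], int]:
    """Unsorted listing: every entry must be examined to be sure of the minimum.
    Returns (best price, number of listings examined)."""
    best: Optional[float] = None
    examined = 0
    for _, rating, price in hotels:
        examined += 1
        if rating == stars and (best is None or price < best):
            best = price
    return best, examined


def cheapest_sorted(hotels: List[Hotel], stars: int) -> Tuple[Optional[float], int]:
    """Listing sorted by (stars, price) by the site: the first match is the
    cheapest, so the user can stop there."""
    examined = 0
    for _, rating, price in sorted(hotels, key=lambda h: (h[1], h[2])):
        examined += 1
        if rating == stars:
            return price, examined
    return None, examined
```

Both functions apply the same comparison knowledge and return the same price for a two-star hotel, but the sorted listing lets the user stop after far fewer entries. The gain comes entirely from the structure of the environment.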

Rational Analysis 

Anderson's rational analysis approach is a specific 
version of methodological adaptationism applied to 
the development of cognitive theory. It was inspired 
by Marr's (1982) influential approach to computer 
vision, in which Marr argued that visual processing 
algorithms (and other intelligent information pro- 
cesses) are "likely understood more readily by un- 
derstanding the nature of the problem being solved 
than by examining the mechanism (and the hard- 
ware) in which it is solved" (p. 92). 12 The term "ra- 
tional analysis" was inspired by rational choice theory 
in economics, in which people are assumed to be 
rational decision makers who optimize their behav- 
ioral choices in order to maximize their goals (utility). 
In rational analysis, however, it is not the person who 
is the agent of rational choice, but rather it is the 
selective forces of the environment that choose better 
biological and behavioral designs. 

Anderson has used rational analysis to study the 
human cognitive architecture by assuming that nat- 
ural information processing mechanisms involved in 
such functions as memory (Anderson & Milson,
1989; Anderson & Schooler, 1991) and categoriza- 
tion (Anderson, 1991) were well designed by evolu- 
tionary forces to meet the problems posed by the 
environment. The key assumption behind rational 
analysis could be stated as 

Principle of rationality: The cognitive system op- 
timizes the adaptation of the behavior of the or- 
ganism. 

As developed by Anderson (1990), rational analysis 
requires a focus on understanding the structure and 
dynamics of the environment. This understanding 
provides a rationale for the design of information 
processing mechanisms. Anderson proposed the fol- 
lowing recipe for rational analysis: 

1. Precisely specify the goals of the agent. 

2. Develop a formal model of the environment to 
which the agent is adapted. 

3. Make minimal assumptions about the compu- 
tational costs. 

4. Derive the optimal behavior of the agent 
considering items 1-3. 

5. Test the optimality predictions against data. 

6. Iterate. 

Note, generally, the emphasized focus on optimal 
behavior under given goals and environmental con- 
straints and the minimal assumptions about the 
computational structure that might produce such 
behavior. 

Probabilistically Textured Environments 

Interaction with the information environment differs 
in a fundamental way from well-defined task envi- 
ronments that have been the dominant paradigms in 
HCI, such as expert text editing (Card et al., 1983) or 
telephone assistance (Gray et al., 1993). In contrast to 
such tasks — in all but the most trivial cases — the in- 
formation forager must deal with a probabilistically 
textured information environment (Brunswik, 1952). 
In contrast to application programs such as text editors 
and spreadsheets, in which actions have fairly deter- 
minate outcomes, 13 foraging through a large volume 
of information involves uncertainties (for a variety of
reasons) about the location, quality, relevance, veracity,
and so on, of the information sought and the
effects of foraging actions. The ecological rationality 
of information foraging behavior must be analyzed
through the theoretical lens and tools appropriate to 
decision making under uncertainty. The determinate 
formalisms and determinate cognitive mechanisms 
that are characteristic of the HCI paradigm are inad- 
equate for the job of theorizing about information 
foraging in probabilistically textured environments. 
Models developed in Information Foraging Theory 
draw upon probabilistic models, and especially 
Bayesian approaches, and they bear similarity to eco- 
nomic models of decision making (rational choice) 
under uncertainty and engineering models. 

Role of Optimization Analysis 

Optimization models 14 are a powerful tool for study- 
ing the design features of organisms and artifacts. 
Consequently, optimization models are often found 
in the toolbox of the methodological adaptationist 
(e.g., as found in Anderson's rational analyses). Op- 
timization models are mathematical models bor- 
rowed from engineering and economics. They are 
used to model a rational decision process faced with a 
problem and constraints. In engineering, they are 
used as a tool for quantifying the quality of design 
alternatives with respect to some problem specifica- 
tion. In economics, they are used typically to char- 
acterize a rational decision maker choosing among 
courses of action in order to maximize utility (a ra- 
tional choice model), often operating in situations of 
limited or uncertain knowledge about possible out- 
comes. Optimization models in general include the 
following three major components: 

• Decision assumptions that specify the decision 
problem to be analyzed, such as the amount of 
time to spend on an activity, or whether or not to 
pursue a particular type of information content. 

• Currency assumptions, which identify how choices 
are to be evaluated, such as time or money or 
other resources. 

• Constraint assumptions, which limit and define 
the relationships among decision and currency 
variables. Examples of constraints include the rate 
at which a person can navigate through an infor- 
mation access interface, or the value of results 
returned by bibliographic search technology. 
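
The three components can be seen together in a minimal sketch. Everything specific here is an assumption chosen for illustration: the decision is how long to spend on an activity, the currency is value gained per unit of total time, and the constraints are a diminishing-returns gain curve and a fixed overhead of two time units before the activity can begin. The optimum is found by simple grid search rather than analytically.

```python
import math
from typing import Iterable


def gain(t: float, max_gain: float = 100.0, rate: float = 0.5) -> float:
    """Constraint assumption (hypothetical): cumulative value gained after
    spending time t on the activity, with diminishing returns."""
    return max_gain * (1.0 - math.exp(-rate * t))


def rate_of_gain(t: float, overhead: float = 2.0) -> float:
    """Currency assumption: value gained per unit of total time, where the
    total includes a fixed overhead (also a constraint)."""
    return gain(t) / (overhead + t)


def best_time(candidates: Iterable[float]) -> float:
    """Decision assumption: choose the amount of time to spend on the
    activity that maximizes the currency."""
    return max(candidates, key=rate_of_gain)


CANDIDATES = [i / 100.0 for i in range(1, 2001)]  # 0.01 to 20.00 time units
T_STAR = best_time(CANDIDATES)
```

Changing either constraint (a steeper gain curve or a longer overhead) shifts the optimum, which is the sense in which the constraints define the relationship between the decision variable and the currency.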

All cognitive agents must reason about the world with 
limited time, knowledge, and computational power. 
Consequently, the use of optimization models can- 
not be taken as a hypothesis that human behavior is 
omnisciently rational, with perfect information and
infinite computational resources. Indeed, unbounded 
optimization models are likely to fail in predicting 
any complex behavior. Anderson's (1990) rational analysis
approach is based on optimization under constraints.
The basic idea is that the constraints of the
environment place important shaping limits on the 
optimization that is possible. 

Optimization models, such as rational choice 
models from economics, allow us to define the be- 
havioral problems that are posed by the environment, 
and they allow us to determine how well humans (or 
animals or other cognitive agents) perform on those 
problems. This does not mean that one assumes that 
the cognitive agent is performing the same calcula- 
tions as the optimization models. It is possible that 
simple mechanisms and heuristics may achieve op- 
timal or near optimal performance once the limits of 
the environment are taken into account (Todd & 
Gigerenzer, 2000). This is the essence of bounded 
rationality and the notion that real cognitive agents 
make choices based on satisficing (Simon, 1955). 
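
The contrast between an optimization benchmark and a simple heuristic can be sketched as follows. The price list and the aspiration level are invented for illustration, and the stopping rule is only one simple reading of satisficing, not a model drawn from the sources cited above.

```python
from typing import Optional, Sequence, Tuple

PRICES = [92.0, 88.0, 70.0, 95.0, 66.0, 81.0, 99.0, 74.0]  # hypothetical


def optimize(prices: Sequence[float]) -> Tuple[float, int]:
    """Exhaustive benchmark: examine every price and take the minimum.
    Returns (price chosen, number of prices examined)."""
    return min(prices), len(prices)


def satisfice(prices: Sequence[float],
              aspiration: float) -> Tuple[Optional[float], int]:
    """Stopping rule: accept the first price at or below the aspiration level.
    Returns (price chosen or None, number of prices examined)."""
    for examined, price in enumerate(prices, start=1):
        if price <= aspiration:
            return price, examined
    return None, len(prices)
```

With an aspiration level of $75, the heuristic stops at the third price and pays $70, against the optimum of $66 found only after examining all eight. Whether that trade is a good one depends on what each additional look costs, which is exactly what the optimization model lets the analyst assess.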

Generally, "One does not treat the optimization 
principle as a formula to be applied blindly to any 
arbitrarily selected attribute of an organism. It is nor- 
mally brought in as a way of expanding our under- 
standing from an often considerable base of knowl- 
edge" (Williams, 1992, p. 62). As eloquently stated by 
the evolutionary theorist G. C. Williams (1992), 

Organisms are never optimally designed. Designs 
of organs, developmental programs, etc. are lega- 
cies from the past and natural selection can affect 
them in only two ways. It can adjust the numbers 
of mutually exclusive designs until they reach 
frequency-dependent equilibria, often with only 
one design that excludes alternatives. It can also 
optimize a design's parameters so as to maximize 
the fitness attainable with that design under cur- 
rent conditions. This is what is usually meant by 
optimization in biology. An analogy might be the 
common wooden-handled, steel-bladed tool de- 
sign. With different parameter values it could be 
a knife, a screw driver, or many other kinds of tool --
many, but not all. The fixed-blade constraint
would rule out turning it into a drill with meshing 
gears. The wood-and-steel constraint would rule it 
out as a hand lens. (p. 56, emphasis original) 

Activities can be analyzed according to the value 
of the resource currency returned and costs incurred. 
Generally, one considers two types of costs: (1) resource
costs and (2) opportunity costs (Hames, 1992).
Resource costs are the expenditures of calories, money, 
and so forth, that are incurred by the chosen activity. 
Opportunity costs are the benefits that could be 
gained by engaging in other activities but are for- 
feited by engaging in the chosen activity. For in- 
stance, junk mail incurs a resource cost in terms of 
the amount of money (not to mention trees) involved 
in delivering the junk, but it also incurs an oppor- 
tunity cost for the recipients who read the junk be- 
cause they have forgone gains that could have been 
made by engaging in other activities. 

Production System Theories of Cognition 

Production systems have had a successful history in 
psychology (Anderson et al., 2004; Neches, Langley, 
& Klahr, 1987) since their introduction into the field 
by Newell (1973a). The ACT family of production 
system theories has the longest history of these kinds of 
cognitive architectures. The seminal version of the 
ACT theory was presented in Anderson (1976), shortly 
after Newell's (1973b) challenge to the field of cognitive
psychology to build unified theories of cognition,
and it has undergone several major revisions since
then (Anderson, 1976, 1983, 1990, 1993; Anderson
et al., 2004; Anderson & Lebiere, 1998). Until recently,
it has been primarily a theory of higher cognition and 
learning, without the kind of emphasis on perceptual- 
motor processing found in EPIC (Kieras & Meyer, 
1997) or MHP (Card et al., 1983). The success of ACT
as a cognitive theory has been historically in the study 
of memory (Anderson & Milson, 1989; Anderson & 
Pirolli, 1984), language (Anderson, 1976), problem 
solving (Anderson, 1993), and categorization (Ander- 
son, 1991). As a learning theory, ACT has been suc- 
cessful (Anderson, 1993) in modeling the acquisition 
of complex cognitive skills for tasks such as computer 
programming, geometry, and algebra and in under- 
standing transfer of learning across tasks (Singley & 
Anderson, 1989). ACT has been strongly tested (An- 
derson, Boyle, Corbett, & Lewis, 1990) by application 
in the development of computer tutors, and less so in 
the area of HCI. The production system models presented
in this book are extensions of the ACT-R theory.

Figure 1.7 presents the basic cognitive architecture
used in this book. It couples the basic ACT-R
architecture to a module that computes information
scent (a kind of utility metric), which for convenience I
will call the ACT-Scent architecture (see note 15). This book
presents specific models of Web foraging (SNIF-ACT
1.0 and SNIF-ACT 2.0) and Scatter/Gather (Cutting,
Karger, Pedersen, & Tukey, 1992) browsing (ACT-IF)
that were developed within the ACT-Scent architecture.
The architecture includes a declarative memory
containing chunks, a procedural memory containing
production rules, and a goal stack containing the
hierarchy of intentions driving behavior. The information
scent module is a new addition to ACT that is used to
compute the utility of actions based on an analysis of
the relationship of content cues from the user interface
to the user's goals. The theory behind this module is
described in detail in chapter 4.

Figure 1.7 The ACT-Scent cognitive architecture.
Information perceived from the external world is
encoded into chunks in declarative memory. Goals
and subgoals controlling the flow of cognitive behavior
are stored in goal memory. The system matches
production rules in production memory against goals
and activated information in declarative memory, and
those that match form a conflict set. The matched
rule instantiations in the conflict set are evaluated by
utility computations performed in the information
scent module. Based on the utility evaluation, a single
production rule instantiation is executed, updates are
made to goal memory and declarative memory, if
necessary, and the cycle begins again. ACT-Scent uses
a process called spreading activation to retrieve
information (in declarative memory) and to evaluate
productions (in the information scent module).
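
The match, evaluate, and execute cycle described for figure 1.7 can be illustrated with a small Python sketch. It is not the ACT-R or ACT-Scent implementation: the `Production` class, the two toy rules, the link names, and the scent values are all hypothetical, and conflict resolution is simplified to a deterministic choice of the highest-utility rule. Spreading activation is not modeled; a fixed number per link stands in for the output of the information scent module.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Production:
    """A hypothetical production rule: condition, utility, and action."""
    name: str
    condition: Callable[[dict, dict], bool]  # does the rule match goal + memory?
    utility: Callable[[dict, dict], float]   # stand-in for the information scent module
    action: Callable[[dict, dict], None]     # updates goal and/or declarative memory


def run_cycle(goal: dict, memory: dict, productions: List[Production]) -> Optional[str]:
    """Run one match-evaluate-execute cycle; return the fired rule's name, or None."""
    conflict_set = [p for p in productions if p.condition(goal, memory)]
    if not conflict_set:
        return None
    # Simplification (assumption): pick the highest-utility rule deterministically.
    best = max(conflict_set, key=lambda p: p.utility(goal, memory))
    best.action(goal, memory)
    return best.name


def unvisited(memory: dict) -> dict:
    """Links in declarative memory that have not been followed yet."""
    return {k: s for k, s in memory["scent"].items() if k not in memory["visited"]}


def click_best_link(goal: dict, memory: dict) -> None:
    links = unvisited(memory)
    memory["visited"].append(max(links, key=links.get))


def leave_site(goal: dict, memory: dict) -> None:
    goal["done"] = True


PRODUCTIONS = [
    Production("click-link",
               condition=lambda g, m: not g["done"] and bool(unvisited(m)),
               utility=lambda g, m: max(unvisited(m).values()),
               action=click_best_link),
    Production("leave-site",
               condition=lambda g, m: not g["done"],
               utility=lambda g, m: g["leave_utility"],
               action=leave_site),
]


def run(goal: dict, memory: dict, max_cycles: int = 10) -> List[str]:
    """Repeat the cycle until no rule matches; return the names of fired rules."""
    trace = []
    for _ in range(max_cycles):
        fired = run_cycle(goal, memory, PRODUCTIONS)
        if fired is None:
            break
        trace.append(fired)
    return trace


if __name__ == "__main__":
    goal = {"done": False, "leave_utility": 0.3}
    memory = {"scent": {"hotel-deals": 0.9, "site-map": 0.2}, "visited": []}
    print(run(goal, memory))   # rules fired, in order
    print(memory["visited"])   # links that were followed
```

In this toy run the high-scent link is followed first; once only a low-scent link remains, the utility of leaving exceeds the utility of clicking, so the leave rule fires and the goal is marked done.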

Summary 

Humans are informavores. We adapt to the world by 
seeking and using information. As a result, we create 
a glut of information. This causes a poverty of at- 
tention and a greater need to allocate that attention 
effectively and efficiently. Information Foraging Theory
is being developed to understand and improve
human-information interaction. It borrows from op- 
timal foraging theory, but it assumes that humans 
optimize the gain of information per unit time cost. 
The following chapters deal with various applications 
of the framework, method, and theory. This includes 
analyses of information foraging on the Web, in 
document browsers, and in social networks. In addi- 
tion, I discuss design and engineering applications of 
the theory that illustrate its practical utility. 

APPENDIX 

The analysis presented in this section is provided for 
those readers with a background that includes exposure 
to basic probability theory and who are interested in the 
mathematics involved in calculating the expected value 
of searching for better hotel prices in the illustration. 

The observed frequency distribution of Paris two-star
hotel prices presented in figure 1.2 is reproduced
in figure 1.A.1. Also shown in figure 1.A.1 is a best-fit
lognormal distribution, which is typically found for 
commodity prices and would probably be character- 
istic of many of the things that one could buy on the 
Web. The estimate was performed by starting with 
the maximum likelihood estimates, which can be biased 
for small samples, and then adjusting the parameters 
slightly to obtain best linear fits on a Q-Q plot. 

A variable X (e.g., prices) is lognormal distributed 
if the natural log of X, ln(X), is normal distributed. 
The probability density function of the lognormal 
distribution is 

$$ f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \, e^{-(\ln x - \mu)^2 / (2\sigma^2)}, \tag{1.A.1} $$

where μ is the mean of ln(X) and σ is the standard
deviation of ln(X). For the prices in figure 1.A.1,
μ = 4.45 and σ = 0.13. The cumulative distribution
function, F(x), for the lognormal is typically computed
numerically using the cumulative distribution
function Φ for the normal distribution,

$$ F(x) = \Phi\left(\frac{\ln x - \mu}{\sigma}\right). \tag{1.A.2} $$

The expected value of a lognormal distributed variable
X is

$$ E(X) = e^{\mu + \sigma^2/2}, \tag{1.A.3} $$

and the variance is

$$ \mathrm{var}(X) = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}. \tag{1.A.4} $$

The distribution in figure 1.A.1 has an expected value
of $86.35 and a variance of $127.09.

Figure 1.A.1 The observed distribution of Paris two-
star hotel prices is approximately lognormal, which is
typical of commodity prices.
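
As a quick numerical check, the expected value and variance quoted above can be reproduced from the fitted parameters μ = 4.45 and σ = 0.13 using the standard lognormal formulas in equations 1.A.1 through 1.A.4. The following is a minimal Python sketch; the function names are illustrative choices and do not come from the text.

```python
import math

MU = 4.45     # mean of ln(price), from the fit in figure 1.A.1
SIGMA = 0.13  # standard deviation of ln(price), from the same fit


def lognormal_pdf(x: float, mu: float = MU, sigma: float = SIGMA) -> float:
    """Probability density of the lognormal (equation 1.A.1), for x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_cdf(x: float, mu: float = MU, sigma: float = SIGMA) -> float:
    """Cumulative distribution (equation 1.A.2), via the normal CDF written with erf."""
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lognormal_mean(mu: float = MU, sigma: float = SIGMA) -> float:
    """Expected value (equation 1.A.3)."""
    return math.exp(mu + sigma ** 2 / 2.0)


def lognormal_variance(mu: float = MU, sigma: float = SIGMA) -> float:
    """Variance (equation 1.A.4)."""
    return (math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2)


if __name__ == "__main__":
    print(round(lognormal_mean(), 2))      # compare with the $86.35 quoted in the text
    print(round(lognormal_variance(), 2))  # compare with the 127.09 quoted in the text
```

Note that the median of the fitted distribution is exp(μ), so `lognormal_cdf(math.exp(MU))` evaluates to 0.5.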

The expected minimum price in figure 1.3 and
expected savings in figure 1.4 were computed from
the probability density function of minimum values.
Assume that prices are sampled n times from a random
variable, such as X characterized above. The
minimum value of that sample of size n can be
characterized as another random variable Y_n,

$$ Y_n = \min\{X_1, X_2, \ldots, X_n\}, \tag{1.A.5} $$

where the X_i are independent random draws from
the random variable X. From the basic definitions of
probability, the cumulative density function for the
minimum of a random sample of size n, Y_n, is defined as
the probability that a randomly sampled value (minimum
prices in this case) will be less than or equal to some value y,

$$ G_n(y) = \Pr(Y_n \le y), \tag{1.A.6} $$

which is equivalent to the probability that the minimum
Y_n is not greater than y,

$$ G_n(y) = 1 - \Pr(Y_n > y). \tag{1.A.7} $$



The probability, Pr(Y_n > y), that the minimum value
of a sample is greater than some value y would be the
same as the probability that every sampled value from
the random variable X was greater than y, so

$$ \Pr(Y_n > y) = \Pr(X_1 > y) \cdot \Pr(X_2 > y) \cdots \Pr(X_n > y) = \Pr(X > y)^n. \tag{1.A.8} $$



Since the meaning of the cumulative density function
for X is

$$ F(x) = \Pr(X \le x), \tag{1.A.9} $$

one can define

$$ \Pr(X > y) = 1 - F(y). \tag{1.A.10} $$



Now, one can substitute equation 1.A.10 into 1.A.8,
and the result into 1.A.7, to get

$$ \begin{aligned} G_n(y) &= \Pr(Y_n \le y) \\ &= 1 - \Pr(Y_n > y) \\ &= 1 - \Pr(X > y)^n \\ &= 1 - [1 - F(y)]^n. \end{aligned} \tag{1.A.11} $$



The probability density function is defined as the
derivative of the cumulative density function. So,
taking the derivative of equation 1.A.11, the probability
density function of the random variable Y_n
representing the minimum of a sample of size n
drawn from variable X will be

$$ g_n(y) = n [1 - F(y)]^{n-1} f(y), \tag{1.A.12} $$



where the probability density function f(x) and cumulative
density function F(x) are for the sampled
random variable X. The expected minimum prices
and expected savings in figures 1.3 and 1.4 were
computed using equation 1.A.5 assuming the
probability density function and cumulative distribution
function in equations 1.A.1 and 1.A.2, with
the parameters μ = 4.45 and σ = 0.13 estimated in
fitting the lognormal in figure 1.A.1.
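
One plausible way to obtain an expected minimum price for a sample of size n is to integrate y times the density of the minimum, g_n(y), from equation 1.A.12. The Python sketch below does this numerically. The midpoint rule, the number of steps, the truncation of the integral to eight standard deviations on either side of μ in log space, and all function names are assumptions made for illustration; the text does not say how the original figures were computed numerically.

```python
import math

MU = 4.45     # mean of ln(price), from the fit in figure 1.A.1
SIGMA = 0.13  # standard deviation of ln(price), from the same fit


def pdf(x: float) -> float:
    """Lognormal density f(x), equation 1.A.1."""
    z = (math.log(x) - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (x * SIGMA * math.sqrt(2.0 * math.pi))


def cdf(x: float) -> float:
    """Lognormal cumulative distribution F(x), equation 1.A.2."""
    z = (math.log(x) - MU) / SIGMA
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def min_pdf(y: float, n: int) -> float:
    """Density g_n(y) of the minimum of n independent draws, equation 1.A.12."""
    return n * (1.0 - cdf(y)) ** (n - 1) * pdf(y)


def expected_minimum(n: int, steps: int = 20000) -> float:
    """Approximate E[Y_n] as the integral of y * g_n(y) using the midpoint rule."""
    lo = math.exp(MU - 8.0 * SIGMA)  # assumed truncation: mass outside is negligible
    hi = math.exp(MU + 8.0 * SIGMA)
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        y = lo + (k + 0.5) * h
        total += y * min_pdf(y, n)
    return total * h


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        print(n, round(expected_minimum(n), 2))
```

With n = 1 the minimum is just a single draw, so the result should agree with the expected value of the distribution; larger samples give progressively lower expected minimum prices, with diminishing returns.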

The utility of production "P3: Next-link" in table 
1.1 was computed by determining the expected sav- 
ings that would be attained by randomly sampling the 
lognormal distribution of prices in figure 1.A.1 while
having a minimum price m already in hand. This
expected savings can be computed by integrating over
all savings achieved by prices less than m and greater
than 0, weighted by the probability of getting those
lower prices. So the expected savings to be achieved
by a randomly sampled price x given that one has a
current minimum price m in hand is

$$ S(m) = \int_0^m (m - x) f(x) \, dx. \tag{1.A.13} $$

Given the lognormal distribution of prices in figure
1.A.1, if the lowest price found so far were $100, then
the expected savings of looking at the next
price would be

S($100) = $14.43.

Some other example expected savings would be

S($90) = $6.62
S($80) = $1.86
S($70) = $0.23
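
The example values above can be checked by evaluating the integral in equation 1.A.13 numerically. The following Python sketch uses a simple midpoint rule, which is an assumed numerical method rather than the one used for the original computation; the function names are likewise illustrative.

```python
import math

MU = 4.45     # mean of ln(price), from the fit in figure 1.A.1
SIGMA = 0.13  # standard deviation of ln(price), from the same fit


def pdf(x: float) -> float:
    """Lognormal density f(x), equation 1.A.1."""
    z = (math.log(x) - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (x * SIGMA * math.sqrt(2.0 * math.pi))


def expected_savings(m: float, steps: int = 20000) -> float:
    """Approximate S(m), the integral of (m - x) f(x) over 0 < x < m (equation 1.A.13)."""
    h = m / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h  # midpoints never touch x = 0, where log is undefined
        total += (m - x) * pdf(x)
    return total * h


if __name__ == "__main__":
    for m in (100, 90, 80, 70):
        print(f"S(${m}) = ${expected_savings(m):.2f}")
```

The rapid fall in S(m) as the price in hand drops is what makes the utility of the "Next-link" production decline during search: once a price near the low end of the distribution has been found, another sample is unlikely to save much.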

Notes 

1. http://www-2.cs.cmu.edu/~hzhang/Newell.Good 
Science. 

2. This example is inspired by a microeconomic 
analysis of the value of information in consumer pur- 
chasing by Stigler (1961). 

3. For early uses of production systems in psychol- 
ogy, see Newell (1973a) and Newell and Simon (1972). 
For overviews and history of their use in psychology, see 
Anderson (1993), and Klahr, Langley, and Neches (1987). 

4. For those familiar with ACT-R 5.0, the produc- 
tions run without the perceptual-motor modules or the 
subsymbolic computations. 

5. Data provided courtesy of Suresh Bhavnani. 

6. I purposely use the phrase "maximization tendency"
because of the assumption that this is an ongoing
process limited by physical and biological bounds on
instantaneously achieving omniscient optimality. It is a
bounded rationality process.

7. The implicit assumption is that energy translates 
into fitness. 

8. As far as I can tell, the term "human-information
interaction" first appeared in the public literature in the 
title of Gershon (1995). 

9. To clarify terminology, what I am calling "knowledge"
corresponds to Newell's (e.g., 1982, 1990) use of the
term. This, in turn, corresponds to Dennett's use of "belief,"
which is consistent with common philosophical usage. 

10. This definition is based on Pearl (1988, pp. 313- 
314). 
11. Newell's technical definition was that "[a] system
is intelligent to the degree that it approximates a
knowledge-level system" (Newell, 1990). Knowledge- 
level systems are discussed below. 

12. See Glimcher (2003) for how Marr's work in- 
spired a parallel rational analysis approach to under- 
standing neuroscience. 

13. Barring bugs, of course. 

14. Following natural selection theorist G. C. Williams
(1992), I prefer the term "optimization model"
over "optimality model" to acknowledge a focus on corrective
processes rather than optimal end states.

15. Pronounced "accent." 